Problems
Works fine, eventually, and the quality is even above average! A few suggestions, however...
You have a shit ton of dependencies. While this is partly unavoidable due to relying on magic, etc., try to reduce them drastically...
The program as a whole requires numpy no greater than 1.26.4. My traceback indicates this is because of the fast_langdetect library, which was archived last year and obviously won't be updated anymore. Try to work around this by vendoring their code into yours or finding some other creative way. You need numpy 2+ support.
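To double-check which installed packages actually cap numpy below 2, something like the following rough sketch might help. It only uses the standard library; the helper names `caps_numpy_below_2` and `packages_capping_numpy` are made up for illustration, and the check only looks at `<` and `<=` bounds, so exact pins like `==1.26.4` are not detected.

```python
import re
from importlib import metadata

# Captures upper bounds such as "<2", "<2.0" or "<=1.26.4".
_BOUND = re.compile(r"(<=?)\s*(\d+(?:\.\d+)*)")


def caps_numpy_below_2(requirement: str) -> bool:
    """Return True if a requirement string targets numpy and excludes 2.x."""
    spec = requirement.split(";", 1)[0].strip()  # drop environment markers
    name = re.match(r"[A-Za-z0-9._-]+", spec)
    if not name or name.group(0).lower() != "numpy":
        return False
    for op, version in _BOUND.findall(spec):
        parts = [int(p) for p in version.split(".")]
        if parts[0] < 2:
            return True  # e.g. "<=1.26.4"
        if op == "<" and parts[0] == 2 and not any(parts[1:]):
            return True  # e.g. "<2" or "<2.0"
    return False


def packages_capping_numpy() -> dict:
    """Map each installed distribution name to its numpy<2 requirement, if any."""
    hits = {}
    for dist in metadata.distributions():
        for req in dist.requires or []:
            if caps_numpy_below_2(req):
                hits[dist.metadata["Name"]] = req
    return hits


if __name__ == "__main__":
    for package, req in sorted(packages_capping_numpy().items()):
        print(f"{package}: {req}")
```

If fast_langdetect is the only package that shows up, that would support vendoring or replacing just that one dependency.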
Recent warnings from transformers state that a video processor should be saved in "video_preprocessor.json" now but you're still using "preprocessor.json".
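As a stopgap for a local checkout, the config could be copied to the new file name. This is only a sketch: the helper `add_video_processor_config` is hypothetical, and the default file names are the ones quoted in the warning above, which may not match the exact names in your checkpoint directory, so check the directory listing first.

```python
import shutil
from pathlib import Path


def add_video_processor_config(
    model_dir,
    legacy_name="preprocessor.json",      # assumption: name as quoted in the warning
    new_name="video_preprocessor.json",   # assumption: name as quoted in the warning
):
    """Copy the legacy config to the new file name if only the legacy one exists.

    Returns the new path, or None when nothing was copied.
    """
    model_dir = Path(model_dir)
    legacy = model_dir / legacy_name
    target = model_dir / new_name
    if target.exists() or not legacy.exists():
        return None
    shutil.copyfile(legacy, target)  # the legacy file is left in place
    return target
```

Copying rather than renaming keeps the old file around in case the image processor still reads from it.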
The image processor within transformers is being called without the "use_fast" parameter. It's best practice to set use_fast=True.
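For reference, passing the flag could look roughly like this, assuming the image processor is loaded through `AutoImageProcessor` (I haven't checked how the repo actually loads it). The wrapper name `load_fast_image_processor` is made up for illustration.

```python
def load_fast_image_processor(model_id: str):
    """Load an image processor, explicitly requesting the fast implementation."""
    # Imported lazily so that merely defining this helper does not require transformers.
    from transformers import AutoImageProcessor

    # use_fast=True asks for the fast processor class when one is available.
    return AutoImageProcessor.from_pretrained(model_id, use_fast=True)
```

Setting the flag explicitly should also make the processor choice independent of whatever the library's default happens to be.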
I'll post more issues as I continue to experiment with this, but overall nice job.
Hello, thank you very much for your feedback on our model. We’ll definitely take these issues into consideration. However, our current focus is on releasing a better version of the model. You're very welcome to submit a PR to help address these problems in the meantime!