Arabic (#8), opened by ABDALLALSWAITI
I tried Arabic, but about 50% of the words were not pronounced correctly.
This is expected at the current stage—or rather, it’s even beyond our expectations. Arabic is significantly underrepresented in the existing training data. We will continue improving support for lower-performing languages in future updates.
For example, it's pretty good for Polish: I would say about 1/3 of the words are not pronounced correctly, but it is still pretty understandable and great for such a small and fast model. But if you say it will improve, is this model a training-in-progress checkpoint, or will you train separate new versions?
I think it's better to have native speakers prepare their language data s