Model convergence
#1, opened by mzboito
Hi there! :)
I'm one of the authors of mHuBERT-147. Many thanks for trying out our model!
I noticed that the fine-tuned version you uploaded might not have converged. If you plan to continue working on it and run into issues, feel free to reach out. We're happy to offer advice!
As a tip: for less-resourced languages, we’ve found it important to increase the dropout (e.g., to 0.1-0.3) and to train in fp32 rather than mixed precision. These adjustments helped us get better results in similar scenarios.
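In case it helps, here is a minimal sketch of how those two settings could be expressed, assuming a Hugging Face `transformers` fine-tuning setup. The helper `low_resource_overrides` is hypothetical, and which dropout fields to raise is an assumption on my side (the key names follow the `HubertConfig` fields); `fp16` and `bf16` are the `TrainingArguments` flags for mixed precision, so leaving both off keeps training in fp32.

```python
def low_resource_overrides(dropout=0.1):
    """Build override dicts for a low-resource fine-tuning run.

    Hypothetical helper: returns (config_overrides, training_overrides).
    """
    if not 0.0 <= dropout < 1.0:
        raise ValueError("dropout must be in [0, 1)")

    # Assumption: the same rate is applied to every dropout field below.
    # The key names mirror HubertConfig attributes.
    config_overrides = {
        "hidden_dropout": dropout,
        "attention_dropout": dropout,
        "activation_dropout": dropout,
        "final_dropout": dropout,
    }

    # Mixed precision stays off, so training runs in fp32.
    training_overrides = {"fp16": False, "bf16": False}

    return config_overrides, training_overrides


config_overrides, training_overrides = low_resource_overrides(dropout=0.2)

# Possible usage (not executed here; CHECKPOINT and OUTPUT_DIR are placeholders):
# from transformers import HubertForCTC, TrainingArguments
# model = HubertForCTC.from_pretrained(CHECKPOINT, **config_overrides)
# args = TrainingArguments(output_dir=OUTPUT_DIR, **training_overrides)
```

The value 0.2 is only an example from the range above; it is worth trying a few values and comparing validation loss.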
Best of luck with your experiments!