Fine-tuning guide
Hi! Thanks for the great model. It's the best CTC alignment implementation I've encountered so far and greatly exceeds the PyTorch tutorial one. I'd like to fine-tune it on our own data; we have a nice dataset of SRTs and audio files ready to go.
Could you please provide a short fine-tuning guide to get started?
Hi, this uses the standard fine-tuning process for wav2vec2 models. Check this guide:
https://huggingface.co/blog/fine-tune-wav2vec2-english
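Roughly, the setup from that post looks like this (a minimal sketch, not a complete script: the checkpoint id is an assumption, and it presumes the checkpoint ships a processor/tokenizer; otherwise build the vocab from your own transcripts as the blog shows):

```python
# Minimal wav2vec2 CTC fine-tuning skeleton following the linked blog post.
# The checkpoint id below is an assumption; swap in whichever model you fine-tune.
import torch
from transformers import Trainer, TrainingArguments, Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "MahmoudAshraf/mms-300m-1130-forced-aligner"  # assumed id of this repo

processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(
    model_id,
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
)
model.freeze_feature_encoder()  # keep the CNN feature encoder frozen, as in the blog

def prepare_example(audio_array, sampling_rate, transcript):
    """Turn one (audio, transcript) pair cut from your SRTs into model inputs."""
    inputs = processor(audio_array, sampling_rate=sampling_rate)
    labels = processor(text=transcript).input_ids
    return {"input_values": inputs.input_values[0], "labels": labels}

training_args = TrainingArguments(
    output_dir="./wav2vec2-finetuned",
    per_device_train_batch_size=8,
    learning_rate=1e-4,
    warmup_steps=500,
    num_train_epochs=5,
    fp16=torch.cuda.is_available(),
)

# With `train_dataset` built via prepare_example and a padding data collator
# (see DataCollatorCTCWithPadding in the blog post):
# trainer = Trainer(model=model, args=training_args, data_collator=data_collator,
#                   train_dataset=train_dataset, tokenizer=processor.feature_extractor)
# trainer.train()
```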
@MahmoudAshraf thanks a lot, will do :)
@MahmoudAshraf
Just a quick follow-up. You mention in the model card:
"The model checkpoint uploaded here is a conversion from torchaudio to HF Transformers for the MMS-300M checkpoint trained on forced alignment dataset"
So if I were to use a different checkpoint, facebook/mms-1b-fl102 for example, what exactly is the conversion that you did here?
Conversion here means converting the checkpoint from the PyTorch (torchaudio) weights format to the HF Transformers weights format.
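For illustration only (the HF repo id below is an assumption), the same forced-alignment weights can be loaded from either format:

```python
# Illustration of the two formats: torchaudio's pipeline bundle vs. the converted
# HF Transformers checkpoint (repo id assumed). No retraining is involved in the
# conversion, only a remapping of the weights into the HF layout.
import torchaudio
from transformers import Wav2Vec2ForCTC

ta_model = torchaudio.pipelines.MMS_FA.get_model()  # torchaudio / plain PyTorch format
hf_model = Wav2Vec2ForCTC.from_pretrained("MahmoudAshraf/mms-300m-1130-forced-aligner")

print("torchaudio params:   ", sum(p.numel() for p in ta_model.parameters()))
print("transformers params: ", sum(p.numel() for p in hf_model.parameters()))
```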
You can use any model directly if it has a suitable vocabulary, but using larger models doesn't necessarily mean better results.
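Something like this can check a candidate checkpoint's vocabulary against your transcripts (a rough sketch; the model id and language code are just examples):

```python
# Rough sketch: check whether a checkpoint's CTC vocabulary covers the characters
# in your transcripts. Model id and target_lang are example values.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/mms-1b-fl102", target_lang="eng")
vocab = set(tok.get_vocab())

sample_transcript = "hello world"  # replace with text pulled from your SRTs
missing = {ch for ch in sample_transcript.lower() if ch != " " and ch not in vocab}
print("characters missing from the vocab:", missing or "none")
```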
@MahmoudAshraf Thanks for your help!