---
license: apache-2.0
base_model:
- AiLab-IMCS-UL/whisper-large-v3-lv-late-cv19
pipeline_tag: automatic-speech-recognition
---
General-purpose Latgalian ASR model
This is a fine-tuned whisper-large-v3 model for Latgalian, trained by AiLab.lv using two general-purpose speech datasets:
- the Latgalian part of Common Voice 20.0,
- the Corpus of Contemporary Latgalian Speech MuLaR.
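A fine-tuned Whisper checkpoint like this one can typically be loaded through the Transformers `pipeline` API. The following is a minimal sketch rather than an official usage recipe: `MODEL_ID` is a hypothetical placeholder (this card does not state the repository id), `build_pipeline_kwargs` and `transcribe` are illustrative helper names, and the 30-second chunk length is an assumption chosen to match Whisper's input window.

```python
# Hypothetical placeholder: replace with the actual repository id of this model.
MODEL_ID = "your-namespace/your-latgalian-whisper-model"


def build_pipeline_kwargs(model_id=MODEL_ID, chunk_length_s=30):
    """Collect the arguments passed to transformers.pipeline.

    chunk_length_s=30 is an assumption: it splits long recordings into
    30-second chunks, which matches Whisper's input window.
    """
    return {
        "task": "automatic-speech-recognition",
        "model": model_id,
        "chunk_length_s": chunk_length_s,
    }


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the helpers above can be used without Transformers installed.
    from transformers import pipeline

    asr = pipeline(**build_pipeline_kwargs(model_id))
    # The ASR pipeline returns a dict whose "text" field holds the transcript.
    return asr(audio_path)["text"]
```

Calling `transcribe("recording.wav")` downloads the model on first use; decoding audio from a file path generally requires `ffmpeg` to be installed.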
Training
As the base model, we used AiLab-IMCS-UL/whisper-large-v3-lv-late-cv19, an ASR model previously fine-tuned for Latvian, and continued fine-tuning it for Latgalian. The fine-tuning was done using the Hugging Face Transformers library.
Training data | Hours |
---|---|
Latgalian Common Voice 20.0 train set (a VW split) | 22.9 |
Corpus of Contemporary Latgalian Speech (MuLaR) train set | 17.3 |
Total | 40.2 |
Evaluation
Test data | WER (%) |
---|---|
Latgalian CV 20.0 test set (1.5 hours) | 9.1 |
MuLaR test set (1.6 hours) | 25.7 |
NB! Unlike the Latgalian CV corpus, the MuLaR corpus contains transcriptions that generally do not follow the rules of standard Latgalian orthography, which likely contributes to the higher WER on the MuLaR test set.
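For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance (substitutions, deletions, insertions) between the reference and the recognized text, divided by the number of reference words. The sketch below illustrates that standard definition only; this card does not specify the exact scoring tool or text normalization behind the reported figures, and the function names and toy sentences are illustrative.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Corpus-level WER in percent: total edits / total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += word_edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references must contain at least one word")
    return 100.0 * total_edits / total_words


# Toy example with placeholder words (not real Latgalian data):
# one substitution and one deletion against four reference words.
print(corpus_wer(["a b c d"], ["a x c"]))
```

Because the score counts exact word matches, spelling variation in the reference transcriptions raises WER even when the recognized words are phonetically correct.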
Acknowledgements
This work was supported by the EU Recovery and Resilience Facility project Language Technology Initiative (2.3.1.1.i.0/1/22/I/CFLA/002) in synergy with the State Research Programme project "Diversity of Latvian in Time and Space" (VPP-LETONIKA-2021/4-0003).