metadata
license: apache-2.0
base_model:
  - AiLab-IMCS-UL/whisper-large-v3-lv-late-cv19
pipeline_tag: automatic-speech-recognition

General-purpose Latgalian ASR model

This is a fine-tuned whisper-large-v3 model for Latgalian, trained by AiLab.lv on two general-purpose speech datasets: Latgalian Common Voice 20.0 and the Corpus of Contemporary Latgalian Speech (MuLaR).

Training

As the base model, we used a previously fine-tuned ASR model for Latvian (AiLab-IMCS-UL/whisper-large-v3-lv-late-cv19) and continued fine-tuning it for Latgalian. The fine-tuning was done with the Hugging Face Transformers library.

| Training data | Hours |
|---|---|
| Latgalian Common Voice 20.0 train set (a VW split) | 22.9 |
| Corpus of Contemporary Latgalian Speech (MuLaR) train set | 17.3 |
| Total | 40.2 |
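Because the model was fine-tuned with Hugging Face Transformers, it can presumably be loaded through the library's standard ASR pipeline. The following is a minimal sketch, not an official usage recipe. `MODEL_ID` is a placeholder: it is set to the base model named in the metadata and should be replaced with this model's own repository ID, which is not stated here. The `transcribe` helper and the audio file name are hypothetical.

```python
# Placeholder: the base model from the metadata. Replace it with the
# repository ID of this Latgalian model (not stated in this card).
MODEL_ID = "AiLab-IMCS-UL/whisper-large-v3-lv-late-cv19"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the Transformers ASR pipeline.

    Requires the `transformers` and `torch` packages, plus ffmpeg for
    decoding audio files given as paths.
    """
    # Imported lazily so the module can be loaded without the dependency.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split recordings longer than 30 s into chunks
    )
    return asr(audio_path)["text"]


if __name__ == "__main__":
    # "sample.wav" is a hypothetical local recording.
    print(transcribe("sample.wav"))
```

Building the pipeline once and reusing it across files avoids reloading the model weights for every recording.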

Evaluation

| Testing data | WER (%) |
|---|---|
| Latgalian CV 20.0 test set (1.5 hours) | 9.1 |
| MuLaR test set (1.6 hours) | 25.7 |

NB! Unlike the Latgalian CV corpus, the MuLaR corpus contains transcriptions that generally do not follow the rules of standard Latgalian orthography.
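For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that definition in plain Python. It is not the evaluation script used for the figures above, and it assumes whitespace tokenization with no text normalization, since the normalization applied in the reported evaluation is not stated. The function name and the sample strings are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Because the score depends on exact word matches, spelling differences between the reference and the model output count as errors, which is why the orthography of the reference transcriptions matters when comparing the two test sets.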

Acknowledgements

This work was supported by the EU Recovery and Resilience Facility project Language Technology Initiative (2.3.1.1.i.0/1/22/I/CFLA/002) in synergy with the State Research Programme project "Diversity of Latvian in Time and Space" (VPP-LETONIKA-2021/4-0003).