---
library_name: transformers
license: openrail
datasets:
- alexandrainst/coral
language:
- da
metrics:
- wer
- cer
base_model:
- openai/whisper-large-v3
pipeline_tag: automatic-speech-recognition
model-index:
- name: coral-1-whisper-large
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: CoRal read-aloud
      type: alexandrainst/coral
      split: test
      args: read_aloud
    metrics:
    - type: cer
      value: 4.3% ± 0.2%
      name: CER
    - type: wer
      value: 10.4% ± 0.3%
      name: WER
---

# Whisper-Large v.3 trained on CoRaL release 1

This is a Danish state-of-the-art speech recognition model, trained by [Alvenir](https://www.alvenir.ai/).

## Evaluation Results

| Model | Number of parameters | [CoRal](https://huggingface.co/datasets/alexandrainst/coral/viewer/read_aloud/test) CER | [CoRal](https://huggingface.co/datasets/alexandrainst/coral/viewer/read_aloud/test) WER |
|:---|---:|---:|---:|
| [Alvenir/coral-1-whisper-large](https://huggingface.co/Alvenir/coral-1-whisper-large) | 1540M | **4.3% ± 0.2%** | **10.4% ± 0.3%** |
| [alexandrainst/roest-315m](https://huggingface.co/alexandrainst/roest-315m) | 315M | 6.6% ± 0.2% | 17.0% ± 0.4% |
| [mhenrichsen/hviske-v2](https://huggingface.co/syvai/hviske-v2) | 1540M | 4.7% ± 0.07% | 11.8% ± 0.3% |
| [openai/whisper-large-v3](https://hf.co/openai/whisper-large-v3) | 1540M | 11.4% ± 0.3% | 28.3% ± 0.6% |

Results of more models and more datasets can be seen in the [model card for Røst-315m](https://huggingface.co/alexandrainst/roest-315m).

## Model details

This is simply the [Whisper Large v.3 model](https://hf.co/openai/whisper-large-v3) trained on the first release of [CoRaL data](https://huggingface.co/datasets/alexandrainst/coral).
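
Since the checkpoint is a fine-tuned Whisper Large v.3, it should load like any other Whisper model in `transformers`. The sketch below is not taken from the authors. It assumes the standard `automatic-speech-recognition` pipeline, and the helper names `build_transcriber` and `transcribe`, as well as the choice to pass `language` and `task` as generation arguments, are illustrative assumptions.

```python
"""Minimal transcription sketch (an assumption, not the authors' official usage)."""

MODEL_ID = "Alvenir/coral-1-whisper-large"  # repository name from the results table


def build_transcriber(model_id: str = MODEL_ID, device: int = -1):
    """Create a Hugging Face ASR pipeline; device=-1 is CPU, 0 is the first GPU."""
    # Imported lazily so the module can be imported without loading transformers.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id, device=device)


def transcribe(audio_path: str, transcriber=None) -> str:
    """Transcribe a single audio file and return the stripped text."""
    if transcriber is None:
        transcriber = build_transcriber()
    # Assumption: pin Whisper to Danish transcription rather than
    # relying on automatic language detection.
    result = transcriber(
        audio_path,
        generate_kwargs={"language": "danish", "task": "transcribe"},
    )
    return result["text"].strip()
```

Calling `transcribe("sample.wav")`, where `sample.wav` is a hypothetical file, downloads the 1540M-parameter checkpoint on first use. Recordings longer than 30 seconds may additionally need chunking or timestamp options on the pipeline.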
The model was trained for 30K steps using the configuration from the [CoRaL repository](https://github.com/alexandrainst/coral) by running:

```bash
python src/scripts/finetune_asr_model.py model=whisper-large max_steps=30000 model.learning_rate=1e-5
```

## License

Note that the dataset used is licensed under a custom license, adapted from OpenRAIL-M, which allows commercial use with a few restrictions (speech synthesis and biometric identification).
See [license](https://huggingface.co/Alvenir/coral-1-whisper-large/blob/main/LICENSE).

## Creators and Funders

The CoRal project is funded by the [Danish Innovation Fund](https://innovationsfonden.dk/) and consists of the following partners:

- [Alexandra Institute](https://alexandra.dk/)
- [University of Copenhagen](https://www.ku.dk/)
- [Agency for Digital Government](https://digst.dk/)
- [Alvenir](https://www.alvenir.ai/)
- [Corti](https://www.corti.ai/)

We would like to specifically thank Dan Saattrup Nielsen (Alexandra Institute) for, among other things, the repository work, and Simon Leminen Madsen (Alexandra Institute) for the modelling work.