CoRal-project/roest-wav2vec2-1B-v2

Automatic Speech Recognition

Safetensors

Danish

wav2vec2

Eval Results

MarieAlvenir committed on Apr 29

Commit 4cb2fb5 (1 parent: dff953e)

Path corrections

README.md +11 -11

@@ -105,7 +105,7 @@ The model was also evaluated on a tentative pre-release of the coral-v2 conversa
 | [CoRal-project/roest-wav2vec2-1B-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2)     |                   1B | Read-aloud and conversation |                                                                                                         23.9% |                                                                                                         36.7% |
 | [CoRal-project/roest-wav2vec2-315M-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v2) |                 315M | Read-aloud and conversation |                                                                                                         24.2% |                                                                                                         37.7% |
 | [CoRal-project/roest-whisper-large-v1](https://huggingface.co/CoRal-project/roest-whisper-large-v1)              |                1540M |                  Read-aloud |                                                                                                          138% |                                                                                                          121% |
-| [alexandrainst/roest-wav2vec2-315M-v1](https://huggingface.co/alexandrainst/roest-315m)             |                 315M |                  Read-aloud |                                                                                                          123% |                                                                                                         80.5% |
 ### Detailed evaluation across demographics on the CoRal test data
 <img src="https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2/resolve/main/images/wer.png">
@@ -161,8 +161,8 @@ The inclusion of a post-processing language model can affect the performance sig
 | [CoRal-project/roest-wav2vec2-1B-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2) |                 1B | Read-aloud and conversation |                                No |                                                                             8.1% ± 0.2% |                                                                            23.9% ± 0.4% |
 | [CoRal-project/roest-wav2vec2-315M-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v2) |                 315M | Read-aloud and conversation |                               Yes |                                                                         **6.5% ± 0.2%** |                                                                        **16.3% ± 0.4%** |
 | [CoRal-project/roest-wav2vec2-315M-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v2) |                 315M | Read-aloud and conversation |                                No |                                                                             8.2% ± 0.2% |                                                                            25.1% ± 0.4% |
-| [alexandrainst/roest-wav2vec2-315m-v1](https://huggingface.co/alexandrainst/roest-315m)                   |                 315M |                  Read-aloud |                               Yes |                                                                             6.6% ± 0.2% |                                                                            17.0% ± 0.4% |
-| [alexandrainst/roest-wav2vec2-315m-v1](https://huggingface.co/alexandrainst/roest-315m)                   |                 315M |                  Read-aloud |                                No |                                                                             8.6% ± 0.2% |                                                                            26.3% ± 0.5% |
 ### Performance on Other Datasets
@@ -207,11 +207,11 @@ We would like specifically to thank Dan Saattrup Nielsen, Alexandra Institute fo
 ## Citation
-We will submit a research paper soon, but until then, if you use this model in your research or development, please cite it as follows:
-@misc{roest-wav2vec2-1B-v2,
-  author    = {Marie Juhl Jørgensen, Søren Vejlgaard Holm, Martin Carsten Nielsen, Dan Saattrup Nielsen, Sif Bernstorff Lehmann, Simon Leminen Madsen, Anders Jess Pedersen, Anna Katrine van Zee, Anders Søgaard and Torben Blach},
-  title     = {Roest-wav2vec-1B-v2: A Danish state-of-the-art speech recognition model trained on varied demographics and dialects},
-  year      = {2025},
-  url       = {https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2},
-}

 | [CoRal-project/roest-wav2vec2-1B-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2)     |                   1B | Read-aloud and conversation |                                                                                                         23.9% |                                                                                                         36.7% |
 | [CoRal-project/roest-wav2vec2-315M-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v2) |                 315M | Read-aloud and conversation |                                                                                                         24.2% |                                                                                                         37.7% |
 | [CoRal-project/roest-whisper-large-v1](https://huggingface.co/CoRal-project/roest-whisper-large-v1)              |                1540M |                  Read-aloud |                                                                                                          138% |                                                                                                          121% |
+| [CoRal-project/roest-wav2vec2-315m-v1](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v1)             |                 315M |                  Read-aloud |                                                                                                          123% |                                                                                                         80.5% |
 ### Detailed evaluation across demographics on the CoRal test data
 <img src="https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2/resolve/main/images/wer.png">
 | [CoRal-project/roest-wav2vec2-1B-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2) |                 1B | Read-aloud and conversation |                                No |                                                                             8.1% ± 0.2% |                                                                            23.9% ± 0.4% |
 | [CoRal-project/roest-wav2vec2-315M-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v2) |                 315M | Read-aloud and conversation |                               Yes |                                                                         **6.5% ± 0.2%** |                                                                        **16.3% ± 0.4%** |
 | [CoRal-project/roest-wav2vec2-315M-v2](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v2) |                 315M | Read-aloud and conversation |                                No |                                                                             8.2% ± 0.2% |                                                                            25.1% ± 0.4% |
+| [CoRal-project/roest-wav2vec2-315m-v1](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v1)                   |                 315M |                  Read-aloud |                               Yes |                                                                             6.6% ± 0.2% |                                                                            17.0% ± 0.4% |
+| [CoRal-project/roest-wav2vec2-315m-v1](https://huggingface.co/CoRal-project/roest-wav2vec2-315m-v1)                   |                 315M |                  Read-aloud |                                No |                                                                             8.6% ± 0.2% |                                                                            26.3% ± 0.5% |
 ### Performance on Other Datasets
 ## Citation
+´´´
+  @misc{roest-wav2vec2-1B-v2,
+    author    = {Marie Juhl Jørgensen, Søren Vejlgaard Holm, Martin Carsten Nielsen, Dan Saattrup Nielsen, Sif Bernstorff Lehmann, Simon Leminen Madsen, Anders Jess Pedersen, Anna Katrine van Zee, Anders Søgaard and Torben Blach},
+    title     = {Roest-wav2vec-1B-v2: A Danish state-of-the-art speech recognition model trained on varied demographics and dialects},
+    year      = {2025},
+    url       = {https://huggingface.co/CoRal-project/roest-wav2vec2-1B-v2},
+  }
+´´´