Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ model-index:
 name: Automatic Speech Recognition
 type: automatic-speech-recognition
 dataset:
-name: CV Benchmark Catalan Accents
+name: CV Benchmark Catalan Accents
 type: projecte-aina/commonvoice_benchmark_catalan_accents
 config: ca
 split: test
@@ -38,7 +38,7 @@ model-index:
 metrics:
 - name: Test WER
 type: wer
-value: 3.
+value: 3.88
 ---
 # NVIDIA Conformer-Transducer Large (ca-es)
 
@@ -61,16 +61,12 @@ The "stt_ca-es_conformer_transducer_large" is an acoustic model based on ["NVIDI
 
 ## Model Description
 
-This model transcribes speech in lowercase Catalan and Spanish alphabet including spaces, and was Fine-tuned on a Bilingual ca-es dataset comprising of
+This model transcribes speech in lowercase Catalan and Spanish alphabet including spaces, and was Fine-tuned on a Bilingual ca-es dataset comprising of 7426 hours. It is a "large" variant of Conformer-Transducer, with around 120 million parameters.
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#conformer-transducer) for complete architecture details.
 
 ## Intended Uses and Limitations
 
-This model can used for Automatic Speech Recognition (ASR) in Catalan and Spanish.
-
-## How to Get Started with the Model
-
-To see an updated and functional version of this code, please check our [Notebook](insert notebook link)
+This model can be used for Automatic Speech Recognition (ASR) in Catalan and Spanish. It is intended to transcribe audio files in Catalan and Spanish to plain text without punctuation.
 
 ### Installation
 
@@ -80,7 +76,7 @@ pip install nemo_toolkit['all']
 ```
 
 ### For Inference
-To transcribe audio in Catalan
+To transcribe audio in Catalan or in Spanish language using this model, you can follow this example:
 
 
 ```python
@@ -95,7 +91,7 @@ print(transcription)
 
 ### Training data
 
-The model was trained on bilingual datasets in Catalan and Spanish
+The model was trained on bilingual datasets in Catalan and Spanish, for a total of 7426 hours.
 
 ### Training procedure
 
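The metadata in this diff reports a Test WER of 3.88, and the updated Intended Uses text says the model emits plain text without punctuation. As a rough illustration of what that metric measures, here is a minimal sketch of word error rate computed after normalizing both strings to lowercase, punctuation-free text. It is not taken from the model card or from NeMo: the helper names `normalize` and `wer`, and the set of characters kept inside words, are assumptions made for this example, and the evaluation behind the reported figure may normalize text differently.

```python
import unicodedata

# Assumption: characters treated as part of a word rather than punctuation
# (apostrophe, Catalan middle dot as in "col·legi", hyphen).
KEEP = {"'", "·", "-"}


def normalize(text: str) -> str:
    """Lowercase the text and replace punctuation with spaces."""
    chars = []
    for ch in text.lower():
        is_punct = unicodedata.category(ch).startswith("P")
        chars.append(ch if (ch in KEEP or not is_punct) else " ")
    # Collapse runs of whitespace into single spaces.
    return " ".join("".join(chars).split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: word-level edit distance / reference length."""
    ref = normalize(reference).split()
    hyp = normalize(hypothesis).split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(wer("Bon dia, món!", "bon dia mon"))
```

Normalizing the reference the same way as the model output keeps punctuation and casing differences from being counted as word errors.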