projecte-aina/stt_ca-es_conformer_transducer_large

Automatic Speech Recognition

AbirMessaoudi committed on Dec 12, 2024

Commit 03e9fcc · verified · 1 Parent(s): 82aafa0

Update README.md

Files changed (1)

README.md +6 -10

@@ -15,7 +15,7 @@ model-index:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: CV Benchmark Catalan Accents
       type: projecte-aina/commonvoice_benchmark_catalan_accents
       config: ca
       split: test
@@ -38,7 +38,7 @@ model-index:
     metrics:
     - name: Test WER
       type: wer
-      value: 3.880
 ---
 # NVIDIA Conformer-Transducer Large (ca-es)
@@ -61,16 +61,12 @@ The "stt_ca-es_conformer_transducer_large" is an acoustic model based on ["NVIDI
 ## Model Description
-This model transcribes speech in lowercase Catalan and Spanish alphabet including spaces, and was Fine-tuned on a Bilingual ca-es dataset comprising of xx hours. It is a "large" variant of Conformer-Transducer, with around 120 million parameters.
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#conformer-transducer) for complete architecture details.
 ## Intended Uses and Limitations
-This model can used for Automatic Speech Recognition (ASR) in Catalan and Spanish. The model is intended to transcribe audio files in Catalan and Spanish to plain text without punctuation.
-## How to Get Started with the Model
-To see an updated and functional version of this code, please check our [Notebook](insert notebook link)
 ### Installation
@@ -80,7 +76,7 @@ pip install nemo_toolkit['all']
 ```
 ### For Inference
-To transcribe audio in Catalan and Spanish using this model, you can follow this example:
 ```python
@@ -95,7 +91,7 @@ print(transcription)
 ### Training data
-The model was trained on bilingual datasets in Catalan and Spanish. The total number of hours is xx.
 ### Training procedure

       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
+      name: CV Benchmark Catalan Accents
       type: projecte-aina/commonvoice_benchmark_catalan_accents
       config: ca
       split: test
     metrics:
     - name: Test WER
       type: wer
+      value: 3.88
 ---
 # NVIDIA Conformer-Transducer Large (ca-es)
 ## Model Description
+This model transcribes speech in lowercase Catalan and Spanish alphabet including spaces, and was Fine-tuned on a Bilingual ca-es dataset comprising of 7426 hours. It is a "large" variant of Conformer-Transducer, with around 120 million parameters.
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#conformer-transducer) for complete architecture details.
 ## Intended Uses and Limitations
+This model can be used for Automatic Speech Recognition (ASR) in Catalan and Spanish. It is intended to transcribe audio files in Catalan and Spanish to plain text without punctuation.
 ### Installation
 ```
 ### For Inference
+To transcribe audio in Catalan or in Spanish language using this model, you can follow this example:
 ```python
 ### Training data
+The model was trained on bilingual datasets in Catalan and Spanish, for a total of 7426 hours.
 ### Training procedure