projecte-aina/parakeet-rnnt-1.1b_cv17_es_ep18_1270h

@@ -111,7 +111,7 @@ print(output[0].text)
 ### Training data
-The specific dataset used to create the model is called ["cv17_es_other_automatically_verified"](https://huggingface.co/datasets/projecte-aina/cv17_es_other_automatically_verified).
 ### Training procedure
@@ -124,6 +124,7 @@ This model is the result of finetuning the model ["parakeet-rnnt-1.1b"](https://
 * learning rate: 2e-4
 * devices=4
 * num_nodes=8
 * accelerator=accelerator
 * strategy="ddp"
 * max_epochs=50

 ### Training data
+The model was trained on two datasets: ["cv17_es_other_automatically_verified"](https://huggingface.co/datasets/projecte-aina/cv17_es_other_automatically_verified) (784 hours and 50 minutes), combined with around 485 hours of Spanish data taken from the "validated" split of [Mozilla Common Voice 17.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0).
 ### Training procedure
 * learning rate: 2e-4
 * devices=4
 * num_nodes=8
+* batch_size=8
 * accelerator=accelerator
 * strategy="ddp"
 * max_epochs=50
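
The figures in this change can be cross-checked with a little arithmetic. The sketch below is only illustrative and rests on two assumptions the model card does not state: that `batch_size=8` is the per-device batch size (the usual meaning under `strategy="ddp"`), and that no gradient accumulation is used. The variable names `world_size`, `global_batch`, and `total_hours` are introduced here for illustration.

```python
# Values copied from the training procedure listed above.
devices = 4      # GPUs per node
num_nodes = 8    # number of nodes
batch_size = 8   # ASSUMPTION: per-device batch size (typical for DDP)

# Under DDP, each process handles one device, so the number of
# processes is devices * num_nodes.
world_size = devices * num_nodes

# ASSUMPTION: no gradient accumulation, so one optimizer step
# sees world_size * batch_size samples.
global_batch = world_size * batch_size

# Dataset sizes in hours, taken from the training data description.
verified_hours = 784 + 50 / 60   # 784 hours and 50 minutes
validated_hours = 485            # stated as "around 485 hours"
total_hours = verified_hours + validated_hours

print(f"world size: {world_size}")
print(f"samples per optimizer step: {global_batch}")
print(f"total training audio: ~{round(total_hours)} h")
```

Under these assumptions the total comes to roughly 1270 hours, which is consistent with the `1270h` suffix in the model name.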