metadata
language:
- 'no'
license: apache-2.0
tags:
- audio
- asr
- automatic-speech-recognition
- hf-asr-leaderboard
model-index:
- name: scream_medium_beta
  results: []
scream_medium_beta
This model is a fine-tuned version of openai/whisper-medium on the NbAiLab/ncc_speech dataset.
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2.5e-05
- lr_scheduler_type: linear
- per_device_train_batch_size: 16
- total_train_batch_size_per_node: 64
- total_train_batch_size: 1024
- total_optimization_steps: 25,000
- starting_optimization_step: None
- finishing_optimization_step: 25,000
- num_train_dataset_workers: 32
- num_hosts: 16
- total_num_training_examples: 25,600,000
- steps_per_epoch: To be computed after first epoch
- num_beams: None
- dropout: True
- bpe_dropout_probability: 0.1
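The batch-size and example-count figures in the list above are mutually consistent, which a few lines of arithmetic can confirm. This is only a sketch: the variable names mirror the list, while the derived names (`devices_per_node`, `global_batch_size`, `examples_seen`) are illustrative, and it assumes one node per host and no gradient accumulation, neither of which the card states.

```python
# Values copied from the hyperparameter list.
per_device_train_batch_size = 16
total_train_batch_size_per_node = 64
total_train_batch_size = 1024
num_hosts = 16
total_optimization_steps = 25_000

# Derived quantities (assumption: one node per host, no gradient accumulation).
devices_per_node = total_train_batch_size_per_node // per_device_train_batch_size
global_batch_size = total_train_batch_size_per_node * num_hosts
examples_seen = global_batch_size * total_optimization_steps

print(devices_per_node)   # 4
print(global_batch_size)  # 1024
print(examples_seen)      # 25600000
```

Under those assumptions, each node holds 4 devices, the 16 hosts together give the reported total batch size of 1024, and 25,000 steps at that batch size account for the reported 25,600,000 training examples.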
Training results
step | validation_fleurs_loss | train_loss | validation_fleurs_wer | validation_fleurs_cer | validation_fleurs_exact_wer | validation_fleurs_exact_cer | validation_stortinget_loss | validation_stortinget_wer | validation_stortinget_cer | validation_stortinget_exact_wer | validation_stortinget_exact_cer | validation_nrk_tv_loss | validation_nrk_tv_wer | validation_nrk_tv_cer | validation_nrk_tv_exact_wer | validation_nrk_tv_exact_cer |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3.6595 | 2.4764 | 17.4301 | 5.4794 | 21.6249 | 6.3977 | 1.3465 | 33.9515 | 19.1377 | 38.4072 | 20.3275 | 1.8386 | 66.2133 | 48.0904 | 75.6490 | 49.8313 |
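For readers unfamiliar with the WER and CER columns, the following is a minimal sketch of how word and character error rates are conventionally computed from an edit distance. It is not the evaluation code used for this model. It assumes the scores in the table are percentages, and the helper names (`edit_distance`, `wer`, `cer`) and the sample sentences are hypothetical. The card does not define the `exact_` variants; a common convention, assumed here, is that they are scored on unnormalized text while the plain variants are scored after text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent; the reference must be non-empty."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; the reference must be non-empty."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one of five reference words is missing.
reference = "det er en fin dag"
hypothesis = "det er fin dag"
print(wer(reference, hypothesis))  # 20.0
print(cer(reference, hypothesis))
```

Because both metrics divide by the reference length, heavy insertion errors can push them above 100.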
Framework versions
- Transformers 4.31.0.dev0
- Datasets 2.13.0
- Tokenizers 0.13.3