---
language:
- 'no'
license: apache-2.0
tags:
- audio
- asr
- automatic-speech-recognition
- hf-asr-leaderboard
base_model: openai/whisper-medium
model-index:
- name: scream_medium_beta
  results: []
---

<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->

# scream_medium_beta

This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the NbAiLab/ncc_speech dataset.
At the last logged step it achieves the following results on the three validation sets (FLEURS, Stortinget, and NRK TV):
- step: 24999
- validation_fleurs_loss: 1.4171
- train_loss: 0.5400
- validation_fleurs_wer: 8.8638
- validation_fleurs_cer: 3.8370
- validation_fleurs_exact_wer: 14.1278
- validation_fleurs_exact_cer: 5.1993
- validation_stortinget_loss: 0.3369
- validation_stortinget_wer: 14.2120
- validation_stortinget_cer: 10.2972
- validation_stortinget_exact_wer: 17.4640
- validation_stortinget_exact_cer: 10.8352
- validation_nrk_tv_loss: 0.8259
- validation_nrk_tv_wer: 39.9035
- validation_nrk_tv_cer: 31.1762
- validation_nrk_tv_exact_wer: 47.4289
- validation_nrk_tv_exact_cer: 32.3674
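
The card does not define how the WER and CER figures above are computed. As a rough illustration only, the sketch below uses the standard definition: the Levenshtein edit distance between reference and hypothesis, divided by the reference length and scaled to a percentage. The function names and the sample sentences are hypothetical, and the distinction between the plain and `exact` variants (presumably with and without text normalization) is an assumption that this sketch does not model.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent; tokens are whitespace-separated words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; every character counts, including spaces."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical sentences, not taken from any of the validation sets.
    print(wer("jeg liker kaffe", "jeg liker te"))
    print(cer("kaffe", "kaffa"))
```

In practice a dedicated library is usually preferable to a hand-rolled implementation; this version is only meant to show what the percentages measure.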

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2.5e-05
- lr_scheduler_type: linear
- per_device_train_batch_size: 16
- total_train_batch_size_per_node: 64
- total_train_batch_size: 1024
- total_optimization_steps: 25,000
- starting_optimization_step: None
- finishing_optimization_step: 25,000
- num_train_dataset_workers: 32
- num_hosts: 16
- total_num_training_examples: 25,600,000
- steps_per_epoch: 6271
- num_beams: None
- dropout: True
- bpe_dropout_probability: 0.1
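
The batch-size and step figures above can be cross-checked with simple arithmetic. The sketch below assumes that "node" and "host" refer to the same unit and that every step consumes one full global batch; the derived quantities (devices per host, examples per epoch, number of epochs) are inferred and are not stated in the card.

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 16
total_train_batch_size_per_node = 64
total_train_batch_size = 1024
total_optimization_steps = 25_000
num_hosts = 16
steps_per_epoch = 6271
total_num_training_examples = 25_600_000

# Derived quantities (assumption: one "node" is one "host").
devices_per_host = total_train_batch_size_per_node // per_device_train_batch_size
global_batch = total_train_batch_size_per_node * num_hosts
examples_seen = total_train_batch_size * total_optimization_steps
examples_per_epoch = steps_per_epoch * total_train_batch_size
approx_epochs = total_optimization_steps / steps_per_epoch

# The reported totals are consistent with each other under these assumptions.
assert global_batch == total_train_batch_size
assert examples_seen == total_num_training_examples

if __name__ == "__main__":
    print(f"devices per host:   {devices_per_host}")
    print(f"examples per epoch: {examples_per_epoch:,}")
    print(f"approximate epochs: {approx_epochs:.2f}")
```

Under these assumptions the run covers roughly four passes over the training data.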

### Training results

| step  | validation_fleurs_loss | train_loss | validation_fleurs_wer | validation_fleurs_cer | validation_fleurs_exact_wer | validation_fleurs_exact_cer | validation_stortinget_loss | validation_stortinget_wer | validation_stortinget_cer | validation_stortinget_exact_wer | validation_stortinget_exact_cer | validation_nrk_tv_loss | validation_nrk_tv_wer | validation_nrk_tv_cer | validation_nrk_tv_exact_wer | validation_nrk_tv_exact_cer |
|:-----:|:----------------------:|:----------:|:---------------------:|:---------------------:|:---------------------------:|:---------------------------:|:--------------------------:|:-------------------------:|:-------------------------:|:-------------------------------:|:-------------------------------:|:----------------------:|:---------------------:|:---------------------:|:---------------------------:|:---------------------------:|
| 0     | 3.6595                 | 2.4764     | 17.4301               | 5.4794                | 21.6249                     | 6.3977                      | 1.3465                     | 33.9515                   | 19.1377                   | 38.4072                         | 20.3275                         | 1.8386                 | 66.2133               | 48.0904               | 75.6490                     | 49.8313                     |
| 5000  | 1.2828                 | 0.6841     | 8.8638                | 3.8864                | 13.2318                     | 4.9529                      | 0.3311                     | 14.5798                   | 10.4664                   | 17.7786                         | 11.0119                         | 0.8229                 | 41.0824               | 31.7759               | 48.7519                     | 33.0088                     |
| 10000 | 1.1134                 | 0.6019     | 8.3284                | 3.6990                | 13.1123                     | 4.9287                      | 0.3132                     | 14.0485                   | 10.1394                   | 17.2099                         | 10.6785                         | 0.7856                 | 39.0957               | 30.3740               | 46.7798                     | 31.5896                     |
| 15000 | 1.1605                 | 0.5821     | 8.5068                | 3.7631                | 13.5603                     | 5.0157                      | 0.3181                     | 13.7633                   | 10.0236                   | 16.9465                         | 10.5585                         | 0.7864                 | 39.4419               | 30.9142               | 46.7507                     | 32.1012                     |
| 20000 | 1.0986                 | 0.5395     | 8.7448                | 3.9456                | 14.4863                     | 5.3733                      | 0.3226                     | 14.2469                   | 10.3402                   | 17.4640                         | 10.8776                         | 0.7884                 | 39.8129               | 31.0325               | 47.3332                     | 32.2280                     |
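
The table is wide, so it can be easier to inspect programmatically. The sketch below copies only the three normalized WER columns from the rows above and reports the step with the lowest value per validation set. The dictionary layout and the helper name `best_step` are illustrative choices, not part of the training code.

```python
# WER columns copied from the training results table (step -> dataset -> WER).
wer_by_step = {
    0:     {"fleurs": 17.4301, "stortinget": 33.9515, "nrk_tv": 66.2133},
    5000:  {"fleurs": 8.8638,  "stortinget": 14.5798, "nrk_tv": 41.0824},
    10000: {"fleurs": 8.3284,  "stortinget": 14.0485, "nrk_tv": 39.0957},
    15000: {"fleurs": 8.5068,  "stortinget": 13.7633, "nrk_tv": 39.4419},
    20000: {"fleurs": 8.7448,  "stortinget": 14.2469, "nrk_tv": 39.8129},
}


def best_step(metric_by_step, dataset):
    """Return the logged step with the lowest metric value for one dataset."""
    return min(metric_by_step, key=lambda step: metric_by_step[step][dataset])


if __name__ == "__main__":
    for dataset in ("fleurs", "stortinget", "nrk_tv"):
        step = best_step(wer_by_step, dataset)
        print(f"{dataset}: lowest WER {wer_by_step[step][dataset]} at step {step}")
```

Among the logged checkpoints, the lowest WER falls at step 10000 for FLEURS and NRK TV and at step 15000 for Stortinget, so later checkpoints do not improve uniformly across the validation sets.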

### Framework versions

- Transformers 4.31.0.dev0
- Datasets 2.13.0
- Tokenizers 0.13.3