metadata
library_name: transformers
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - tachelhit_darija
metrics:
  - wer
model-index:
  - name: whisper-small-darija
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: tachelhit_darija
          type: tachelhit_darija
          config: default
          split: None
          args: default
        metrics:
          - type: wer
            value: 27.93522267206478
            name: Wer

whisper-small-darija

This model is a fine-tuned version of openai/whisper-small on the tachelhit_darija dataset. At the final training step (1000), it achieves the following results on the evaluation set:

  • Loss: 0.3828
  • WER: 27.9352
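
For readers unfamiliar with the metric, the sketch below shows how a word error rate can be computed for a single reference/hypothesis pair: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It assumes the value reported above is on a percentage scale and that words are split on whitespace; the card states neither, and the text normalization used during evaluation is unknown. The function name `word_error_rate` is illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: 100 * (S + D + I) / N over whitespace-split words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and hyp[:j]; start with the empty reference prefix.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Evaluation libraries typically aggregate errors and reference lengths over the whole evaluation set before dividing, rather than averaging per-utterance scores, so this per-pair function is only a building block.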

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1000
  • mixed_precision_training: Native AMP
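
As a rough illustration of what the listed scheduler settings imply, the sketch below computes the learning rate at a given step for a linear schedule with warmup: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the final step. It is a standalone approximation of that schedule's shape using the values above (1e-05, 100 warmup steps, 1000 training steps), not the library's own code, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int,
              base_lr: float = 1e-5,
              warmup_steps: int = 100,
              total_steps: int = 1000) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Learning rate at a few points along the 1000-step run.
for s in (0, 50, 100, 550, 1000):
    print(s, linear_lr(s))
```

Under these settings the peak rate is reached at step 100, which coincides with the first evaluation in the training results.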

Training results

Training Loss   Epoch     Step   Validation Loss   WER
0.572           1.4286    100    0.6403            60.7287
0.2156          2.8571    200    0.4233            42.7800
0.0459          4.2857    300    0.3953            48.1781
0.0257          5.7143    400    0.3663            31.0391
0.0089          7.1429    500    0.3857            31.9838
0.0029          8.5714    600    0.3748            30.3644
0.0026          10.0      700    0.3756            29.4197
0.0012          11.4286   800    0.3801            27.5304
0.0011          12.8571   900    0.3821            27.9352
0.0013          14.2857   1000   0.3828            27.9352
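
The results above can be inspected programmatically. The sketch below transcribes the rows, finds the evaluation steps with the lowest WER and the lowest validation loss, and derives the number of optimizer steps per epoch from the Epoch and Step columns. The final line estimates an upper bound on the training set size under the assumption of a single device and no gradient accumulation, neither of which the card states.

```python
# Rows transcribed from the training results:
# (step, epoch, training_loss, validation_loss, wer)
rows = [
    (100, 1.4286, 0.572, 0.6403, 60.7287),
    (200, 2.8571, 0.2156, 0.4233, 42.7800),
    (300, 4.2857, 0.0459, 0.3953, 48.1781),
    (400, 5.7143, 0.0257, 0.3663, 31.0391),
    (500, 7.1429, 0.0089, 0.3857, 31.9838),
    (600, 8.5714, 0.0029, 0.3748, 30.3644),
    (700, 10.0, 0.0026, 0.3756, 29.4197),
    (800, 11.4286, 0.0012, 0.3801, 27.5304),
    (900, 12.8571, 0.0011, 0.3821, 27.9352),
    (1000, 14.2857, 0.0013, 0.3828, 27.9352),
]

best_wer = min(rows, key=lambda r: r[4])   # row with the lowest WER
best_loss = min(rows, key=lambda r: r[3])  # row with the lowest validation loss

# Steps per epoch follow from step / epoch, rounded to the nearest integer.
steps_per_epoch = round(rows[0][0] / rows[0][1])

# ASSUMPTION: one device, no gradient accumulation, train_batch_size of 16.
max_train_examples = steps_per_epoch * 16

print("lowest WER:", best_wer[4], "at step", best_wer[0])
print("lowest validation loss:", best_loss[3], "at step", best_loss[0])
print("steps per epoch:", steps_per_epoch)
print("upper bound on training examples:", max_train_examples)
```

The lowest WER (27.5304, step 800) and the lowest validation loss (0.3663, step 400) occur at different steps, and neither is the final step reported at the top of the card. Training loss falls to about 0.001 while validation loss stays near 0.38 from step 400 onward.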

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0