metadata
library_name: transformers
license: apache-2.0
base_model: openai/whisper-medium
tags:
  - generated_from_trainer
datasets:
  - audiofolder
metrics:
  - wer
model-index:
  - name: whisper-medium-konnakol-rests-0.2
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: audiofolder
          type: audiofolder
          config: default
          split: test
          args: default
        metrics:
          - name: Wer
            type: wer
            value: 32.26744186046512

whisper-medium-konnakol-rests-0.2

This model is a fine-tuned version of openai/whisper-medium, trained on data loaded through the generic audiofolder loader (the underlying dataset is not named). It achieves the following results on the evaluation set:

  • Loss: 0.1067
  • Wer: 32.2674
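
For readers unfamiliar with the metric, the sketch below shows how a word error rate of this kind is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words, times 100. It assumes the Wer value reported here is a percentage, which the magnitude of the numbers suggests but the card does not state. The function name `word_error_rate` and the example transcripts are hypothetical and are not taken from the training code or data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length * 100."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical transcripts: one of four reference words is missing.
print(word_error_rate("ta ka di mi", "ta ka mi"))  # 25.0
```

The evaluation pipeline may normalize text (casing, punctuation) before scoring; this sketch applies no normalization.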

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 300
  • mixed_precision_training: Native AMP
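
Two of the values above follow from the others, and a short sketch makes the arithmetic explicit: how total_train_batch_size results from train_batch_size and gradient_accumulation_steps, and what a linear schedule with 100 warmup steps over 300 training steps looks like. This is an illustration in plain Python, not the training script. It assumes a single device (consistent with 2 × 8 = 16) and the usual linear warmup followed by linear decay to zero. The function names are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-5
WARMUP_STEPS = 100
TRAINING_STEPS = 300
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 8


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation * num_devices


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        scale = step / max(1, WARMUP_STEPS)
    else:
        remaining = TRAINING_STEPS - step
        scale = max(0.0, remaining / max(1, TRAINING_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * scale


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))  # 16

# Learning rate at the evaluation steps listed in the training results.
for step in (50, 100, 150, 200, 250, 300):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at 1e-05 at step 100 and reaches zero at step 300.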

Training results

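| Training Loss | Epoch   | Step | Validation Loss | Wer     |
|---------------|---------|------|-----------------|---------|
| 0.7473        | 16.5333 | 50   | 0.0780          | 64.8256 |
| 0.0143        | 33.2667 | 100  | 0.0719          | 49.7093 |
| 0.0079        | 49.8    | 150  | 0.0784          | 54.6512 |
| 0.0018        | 66.5333 | 200  | 0.1028          | 39.8256 |
| 0.0003        | 83.2667 | 250  | 0.1079          | 32.2674 |
| 0.0004        | 99.8    | 300  | 0.1067          | 32.2674 |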

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.1