alakxender/whisper-large-dv-a40

Automatic Speech Recognition

Generated from Trainer


This model is a fine-tuned version of openai/whisper-large-v3 on an unspecified dataset. At the final checkpoint (step 4000) it achieves the following results on the evaluation set:

Loss: 0.0252
Wer: 2.0163
Wer Ortho: 15.2648
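The card does not say how the two error rates differ. A common convention in Whisper fine-tuning scripts, assumed here and not confirmed by this card, is that "Wer Ortho" is the word error rate on raw text (punctuation and casing included) and "Wer" is the word error rate after text normalization, both reported as percentages. The sketch below illustrates that distinction with a hypothetical normalizer; it is not the author's evaluation code.

```python
import unicodedata


def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_percent(references, hypotheses):
    """Corpus-level WER as a percentage: total word edits / total reference words."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        errors += word_edit_distance(ref_words, hyp_words)
        total += len(ref_words)
    return 100.0 * errors / total


def normalize(text):
    """Hypothetical normalizer (an assumption): lowercase and strip punctuation."""
    text = text.lower()
    text = "".join(ch for ch in text
                   if not unicodedata.category(ch).startswith("P"))
    return " ".join(text.split())


refs = ["Hello, world."]
hyps = ["hello world"]

wer_ortho = wer_percent(refs, hyps)
wer_norm = wer_percent([normalize(r) for r in refs],
                       [normalize(h) for h in hyps])
```

Under this reading, a large gap between the two figures (here 2.0163 versus 15.2648) would mean that most remaining errors involve punctuation or similar surface differences rather than wrong words.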

Model description

More information needed

Intended uses & limitations

More information needed
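Until the author documents intended uses, the following is a minimal sketch of how a checkpoint like this is typically loaded for transcription. It assumes the `transformers` library is installed, that the repository loads with the standard automatic-speech-recognition pipeline, and that `audio_path` points to a local audio file (a hypothetical input, which also requires ffmpeg for decoding).

```python
MODEL_ID = "alakxender/whisper-large-dv-a40"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint.

    Assumption: the repository works with the stock ASR pipeline.
    """
    # Imported lazily so that defining this function does not require
    # transformers to be installed or the model weights to be downloaded.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("sample.wav")` (a placeholder file name) downloads the weights on first use and returns the recognized text as a string.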

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_steps: 50
training_steps: 4000
mixed_precision_training: Native AMP
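To make the scheduler settings concrete, here is a small sketch of the learning rate that a constant_with_warmup schedule produces with the values listed above: a linear ramp from 0 to 1e-05 over the first 50 steps, then a flat rate for the remaining steps. The function name is illustrative and this is a plain-Python rendering of the schedule's shape, not the training script itself.

```python
def lr_at_step(step, base_lr=1e-05, warmup_steps=50):
    """Learning rate under a constant schedule with linear warmup."""
    if step < warmup_steps:
        # Linear ramp: 0 at step 0, approaching base_lr at warmup_steps.
        return base_lr * step / max(1, warmup_steps)
    # After warmup the rate stays constant until training ends.
    return base_lr


# Sample the schedule at a few points across the 4000 training steps.
schedule = {step: lr_at_step(step) for step in (0, 25, 50, 2000, 4000)}
```

With only 50 warmup steps out of 4000, the model trains at the full rate of 1e-05 for almost the entire run, with no decay at the end.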

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Wer Ortho
0.0426	0.0354	500	0.0501	3.6048	24.5780
0.0293	0.0709	1000	0.0367	2.6889	21.3792
0.0251	0.1063	1500	0.0317	2.3869	17.6751
0.0244	0.1418	2000	0.0296	2.2782	16.7890
0.0209	0.1772	2500	0.0284	2.2486	16.2831
0.0205	0.2126	3000	0.0254	1.9749	14.9776
0.0234	0.2481	3500	0.0261	2.1892	15.1784
0.0229	0.2835	4000	0.0252	2.0163	15.2648
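Two things can be read off the table with simple arithmetic: which checkpoint scored best on each metric, and roughly how large the training set is. The sketch below copies the table values and assumes an effective batch size of 16 (single device, no gradient accumulation), which the card does not state explicitly; the size estimate is only as good as that assumption.

```python
# (step, epoch, validation_loss, wer, wer_ortho), copied from the table above.
ROWS = [
    (500, 0.0354, 0.0501, 3.6048, 24.5780),
    (1000, 0.0709, 0.0367, 2.6889, 21.3792),
    (1500, 0.1063, 0.0317, 2.3869, 17.6751),
    (2000, 0.1418, 0.0296, 2.2782, 16.7890),
    (2500, 0.1772, 0.0284, 2.2486, 16.2831),
    (3000, 0.2126, 0.0254, 1.9749, 14.9776),
    (3500, 0.2481, 0.0261, 2.1892, 15.1784),
    (4000, 0.2835, 0.0252, 2.0163, 15.2648),
]

TRAIN_BATCH_SIZE = 16  # assumed to be the effective batch size

# Checkpoint with the lowest value of each metric.
best_by_wer = min(ROWS, key=lambda row: row[3])
best_by_wer_ortho = min(ROWS, key=lambda row: row[4])
best_by_loss = min(ROWS, key=lambda row: row[2])

# Examples seen so far divided by the fraction of an epoch completed
# gives an estimate of the number of training examples per epoch.
last_step, last_epoch = ROWS[-1][0], ROWS[-1][1]
estimated_train_examples = last_step * TRAIN_BATCH_SIZE / last_epoch
```

On these numbers, step 3000 has the lowest Wer and Wer Ortho, while the final step 4000 has the lowest validation loss, and the whole run covers under a third of one epoch of a training set on the order of 226,000 examples.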

Framework versions

Transformers 4.41.0.dev0
Pytorch 2.3.0+cu121
Datasets 2.19.0
Tokenizers 0.19.1


Safetensors

Model size: 1.54B params
Tensor type: F32
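As a rough guide to what these two figures imply for storage and memory, the weights alone take about parameter count times bytes per parameter. The sketch below applies that to 1.54B parameters; the F16 entry is a hypothetical comparison, not a format this repository is stated to provide, and runtime memory for activations is not included.

```python
PARAM_COUNT = 1.54e9  # from the model size listed above

# Bytes per parameter for each tensor type.
BYTES_PER_PARAM = {"F32": 4, "F16": 2}


def weights_size_gb(param_count, tensor_type):
    """Approximate size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[tensor_type] / 1e9


size_f32_gb = weights_size_gb(PARAM_COUNT, "F32")  # as published
size_f16_gb = weights_size_gb(PARAM_COUNT, "F16")  # hypothetical half-precision copy
```

At F32 this comes to roughly 6.2 GB of weights, and casting to half precision would halve that.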


Model tree for alakxender/whisper-large-dv-a40

Base model: openai/whisper-large-v3
Finetuned (560): this model

Collection including alakxender/whisper-large-dv-a40

Audio

Dhivehi Voice AI Collection: Tools for Thaana speech recognition (ASR), text-to-speech (TTS), and audio processing • 26 items • Updated Apr 24
