Whisper-small-ru-v2

This model is a fine-tuned version of openai/whisper-small on the Russian part of the Common Voice 15 dataset. It achieves the following results on the evaluation set:

Loss: 0.1329
Wer: 12.6750
Cer: 3.7305
Learning Rate: 0.0000
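The card does not state how WER and CER were computed or which text normalization was applied. As a rough illustration only, the sketch below computes both metrics as percentages from a Levenshtein edit distance over words and characters. The function names and the sample sentences are hypothetical, and reading the reported values as percentages is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumes a non-empty reference)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, counting spaces as characters."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong word out of three, one wrong character out of fifteen.
sample_wer = wer("привет как дела", "привет так дела")
sample_cer = cer("привет как дела", "привет так дела")
```

Because a single substituted character can invalidate a whole word, CER is usually much lower than WER on the same transcripts, which matches the gap between the two numbers reported above.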

Model description

Same as openai/whisper-small.

Intended uses & limitations

Same as openai/whisper-small.
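Since the intended use matches the base model, transcription should work the same way. The following is a minimal sketch, assuming the `transformers` library is installed (plus ffmpeg for decoding audio files) and that the checkpoint is available under the repository ID artyomboyko/whisper-small-ru-v2. The helper name `transcribe` and the file path in the comment are hypothetical.

```python
MODEL_ID = "artyomboyko/whisper-small-ru-v2"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file to Russian text with the ASR pipeline."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Pin the language and task so Whisper does not have to auto-detect them.
    result = asr(
        audio_path,
        generate_kwargs={"language": "russian", "task": "transcribe"},
    )
    return result["text"]


# Usage (downloads the checkpoint on first call):
# text = transcribe("sample.wav")  # "sample.wav" is a placeholder path
```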

Training and evaluation data

Fine-tuned on the Russian part of the Common Voice 15 dataset.

Training procedure

The training procedure follows the article "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers".

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-08
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 250
training_steps: 15000
mixed_precision_training: Native AMP
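Two of these values follow from the others, which the sketch below illustrates: the total batch size is the per-device batch size times the accumulation steps, and a linear scheduler ramps the learning rate up during warmup and then decays it to zero at the last step. The device count of 1 is an assumption (it is the value consistent with 16 × 2 = 32), and the function names are hypothetical.

```python
LEARNING_RATE = 5e-08
WARMUP_STEPS = 250
TRAINING_STEPS = 15000
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples contributing to one optimizer update (num_devices=1 is assumed)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, peak=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
lr_at_peak = linear_lr(WARMUP_STEPS)     # end of warmup: the full 5e-08
lr_at_end = linear_lr(TRAINING_STEPS)    # final step: decayed to 0.0
```

Under this schedule the rate at step 15000 is exactly zero, and even the peak of 5e-08 rounds to 0.0000 at four decimal places, which is consistent with the learning rate listed among the evaluation results.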

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
0.0661	0.09	500	0.1358	12.9097	3.8217
0.0616	0.17	1000	0.1357	12.9620	3.8949
0.0601	0.26	1500	0.1357	12.8795	3.8225
0.0666	0.35	2000	0.1353	12.9481	3.8871
0.0669	0.43	2500	0.1352	12.8284	3.8283
0.0665	0.52	3000	0.1351	12.8203	3.7833
0.0649	0.61	3500	0.1349	12.8098	3.7824
0.0607	0.69	4000	0.1347	12.8110	3.8105
0.0636	0.78	4500	0.1345	12.7994	3.7893
0.063	0.87	5000	0.1342	12.8319	3.8084
0.0589	0.95	5500	0.1341	12.8807	3.8551
0.0734	1.04	6000	0.1341	12.7691	3.7604
0.0577	1.13	6500	0.1340	12.7645	3.7602
0.052	1.21	7000	0.1340	12.7610	3.7655
0.0626	1.3	7500	0.1339	12.7657	3.7593
0.0617	1.39	8000	0.1338	12.7912	3.8268
0.063	1.47	8500	0.1337	12.7343	3.7573
0.0668	1.56	9000	0.1336	12.7308	3.7198
0.0634	1.65	9500	0.1335	12.7215	3.7400
0.0604	1.73	10000	0.1333	12.7192	3.7515
0.0707	1.82	10500	0.1333	12.7052	3.7568
0.0639	1.91	11000	0.1332	12.6983	3.7617
0.0617	1.99	11500	0.1331	12.6936	3.7402
0.0601	2.08	12000	0.1330	12.6901	3.7586
0.0632	2.17	12500	0.1330	12.6785	3.7279
0.0626	2.25	13000	0.1330	12.6808	3.7333
0.066	2.34	13500	0.1329	12.6704	3.7512
0.0674	2.42	14000	0.1329	12.6599	3.7384
0.0637	2.51	14500	0.1329	12.6797	3.7428
0.0641	2.6	15000	0.1329	12.6750	3.7305
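The final checkpoint is not the best row on every metric. As a small sketch, the snippet below takes a handful of rows copied from the table above (not the full log) and picks the step with the lowest WER and the lowest CER, then computes the relative WER change between the first and last logged steps. The variable and function names are hypothetical.

```python
# (step, validation_loss, wer, cer): a subset of rows copied from the table.
ROWS = [
    (500, 0.1358, 12.9097, 3.8217),
    (9000, 0.1336, 12.7308, 3.7198),
    (14000, 0.1329, 12.6599, 3.7384),
    (14500, 0.1329, 12.6797, 3.7428),
    (15000, 0.1329, 12.6750, 3.7305),
]


def best_step(rows, column):
    """Return the step whose value in `column` (2 = WER, 3 = CER) is lowest."""
    return min(rows, key=lambda row: row[column])[0]


def relative_reduction(first, last):
    """Relative improvement from `first` to `last`, in percent."""
    return 100.0 * (first - last) / first


best_wer_step = best_step(ROWS, 2)
best_cer_step = best_step(ROWS, 3)
wer_gain = relative_reduction(ROWS[0][2], ROWS[-1][2])
```

On these rows the lowest WER is at step 14000 and the lowest CER at step 9000, while the WER improvement from step 500 to step 15000 is under 2% relative.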

Framework versions

Transformers 4.36.0.dev0
Pytorch 2.1.1+cu121
Datasets 2.15.0
Tokenizers 0.15.0

Safetensors

Model size: 0.2B params
Tensor type: F32

Model tree for artyomboyko/whisper-small-ru-v2

Base model: openai/whisper-small

Evaluation results

Test WER on Common Voice 15 (self-reported): 12.675
Test CER on Common Voice 15 (self-reported): 3.731