
Whisper Small GA-EN Speech Translation, fine-tuned from v1.2, without SpokenWords

This model is a fine-tuned version of openai/whisper-small on the IWSLT-2023, FLEURS, and BiteSize datasets. It achieves the following results on the evaluation set:

  • Loss: 1.7177
  • Bleu: 29.96
  • Chrf: 45.61
  • Wer: 66.9968
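
As a usage illustration (not part of the original card), the checkpoint can be loaded through the standard transformers speech-recognition pipeline, with Whisper's "translate" task selected so the model emits English text for Irish speech. The repository ID below is ymoslem/whisper-small-ga2en-v1.3, and audio.wav is a hypothetical 16 kHz Irish-language recording.

```python
# Minimal inference sketch (illustrative, not from the original card).
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-small-ga2en-v1.3",
)

# Whisper distinguishes "transcribe" (same-language) from "translate"
# (into English); this model is used for GA->EN translation.
result = pipe(
    "audio.wav",  # placeholder path to a 16 kHz mono recording
    generate_kwargs={"task": "translate"},
)
print(result["text"])
```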

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 5000
  • mixed_precision_training: Native AMP
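
As a hedged sketch (not the authors' actual training script), these values map onto transformers Seq2SeqTrainingArguments roughly as shown below; the output directory is a placeholder, and the 0.03 warmup value is assumed to be a ratio rather than a step count.

```python
# Hypothetical reconstruction of the training configuration; values mirror
# the hyperparameters listed above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-ga2en",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",          # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_ratio=0.03,            # assumed: the card's 0.03 reads as a ratio
    max_steps=5000,
    fp16=True,                    # mixed precision (native AMP)
)
```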

Training results

| Training Loss | Epoch | Step | Bleu  | Chrf  | Validation Loss | Wer      |
|---------------|-------|------|-------|-------|-----------------|----------|
| 2.4954        | 0.11  | 100  | 3.7   | 18.03 | 2.1286          | 179.7839 |
| 2.045         | 0.22  | 200  | 12.65 | 25.53 | 1.8146          | 100.9005 |
| 1.7928        | 0.32  | 300  | 13.78 | 30.2  | 1.7253          | 101.9811 |
| 1.6615        | 0.43  | 400  | 15.8  | 31.88 | 1.6834          | 92.5259  |
| 1.4491        | 0.54  | 500  | 15.61 | 36.27 | 1.5971          | 107.3841 |
| 1.2074        | 0.65  | 600  | 19.92 | 36.31 | 1.5939          | 84.3314  |
| 1.2308        | 0.76  | 700  | 20.37 | 38.72 | 1.5234          | 84.8267  |
| 1.107         | 0.86  | 800  | 21.35 | 37.87 | 1.5460          | 82.8906  |
| 0.9491        | 0.97  | 900  | 21.06 | 40.74 | 1.5161          | 82.5754  |
| 0.384         | 1.08  | 1000 | 23.24 | 41.98 | 1.4927          | 82.2152  |
| 0.362         | 1.19  | 1100 | 23.19 | 42.24 | 1.5567          | 80.2792  |
| 0.3756        | 1.29  | 1200 | 27.83 | 43.8  | 1.5265          | 69.2481  |
| 0.3401        | 1.4   | 1300 | 21.79 | 41.66 | 1.5522          | 92.3908  |
| 0.3346        | 1.51  | 1400 | 24.61 | 42.15 | 1.5085          | 75.4615  |
| 0.3101        | 1.62  | 1500 | 26.67 | 43.41 | 1.4933          | 70.7789  |
| 0.3231        | 1.73  | 1600 | 27.95 | 42.82 | 1.4979          | 68.3026  |
| 0.2665        | 1.83  | 1700 | 28.5  | 43.76 | 1.4977          | 68.1225  |
| 0.2704        | 1.94  | 1800 | 28.15 | 43.87 | 1.5063          | 68.8429  |
| 0.0769        | 2.05  | 1900 | 25.76 | 43.22 | 1.5162          | 77.6227  |
| 0.0597        | 2.16  | 2000 | 25.04 | 43.15 | 1.5216          | 79.0635  |
| 0.0743        | 2.27  | 2100 | 27.85 | 44.43 | 1.5313          | 68.3926  |
| 0.0878        | 2.37  | 2200 | 27.54 | 43.96 | 1.5495          | 68.3476  |
| 0.0712        | 2.48  | 2300 | 28.28 | 44.39 | 1.5355          | 65.8712  |
| 0.0789        | 2.59  | 2400 | 28.64 | 44.75 | 1.5277          | 65.7812  |
| 0.073         | 2.7   | 2500 | 29.09 | 44.65 | 1.5327          | 65.7812  |
| 0.073         | 2.8   | 2600 | 25.26 | 43.44 | 1.5304          | 78.2981  |
| 0.0697        | 2.91  | 2700 | 25.71 | 43.02 | 1.5460          | 78.4782  |
| 0.0398        | 3.02  | 2800 | 28.26 | 44.71 | 1.5580          | 72.8501  |
| 0.0302        | 3.13  | 2900 | 30.25 | 45.46 | 1.5688          | 66.1414  |
| 0.0424        | 3.24  | 3000 | 29.88 | 45.21 | 1.5693          | 66.0964  |
| 0.0397        | 3.34  | 3100 | 30.01 | 45.85 | 1.5934          | 65.6911  |
| 0.0346        | 3.45  | 3200 | 30.2  | 45.8  | 1.5818          | 65.8262  |
| 0.032         | 3.56  | 3300 | 29.81 | 46.5  | 1.5823          | 66.7267  |
| 0.0348        | 3.67  | 3400 | 30.77 | 46.43 | 1.5752          | 64.6556  |
| 0.0522        | 5.97  | 3500 | 29.69 | 45.47 | 1.6080          | 65.8712  |
| 0.0443        | 6.14  | 3600 | 29.54 | 44.71 | 1.6272          | 65.1508  |
| 0.0492        | 6.31  | 3700 | 29.3  | 45.36 | 1.6211          | 68.3926  |
| 0.0544        | 6.48  | 3800 | 30.08 | 44.39 | 1.6069          | 64.9257  |
| 0.0574        | 6.66  | 3900 | 28.86 | 44.6  | 1.6306          | 66.2765  |
| 0.0535        | 6.83  | 4000 | 27.92 | 43.48 | 1.6722          | 67.9874  |
| 0.0424        | 7.0   | 4100 | 27.48 | 44.29 | 1.6968          | 70.3737  |
| 0.0235        | 7.17  | 4200 | 27.97 | 45.34 | 1.6768          | 70.0135  |
| 0.0262        | 7.34  | 4300 | 28.77 | 45.74 | 1.6908          | 68.3926  |
| 0.0218        | 7.51  | 4400 | 28.97 | 46.57 | 1.6890          | 69.5182  |
| 0.0293        | 7.68  | 4500 | 29.51 | 45.38 | 1.6742          | 68.8429  |
| 0.0194        | 7.85  | 4600 | 29.63 | 45.18 | 1.6962          | 67.9874  |
| 0.0187        | 8.02  | 4700 | 30.1  | 45.28 | 1.6936          | 66.0964  |
| 0.0115        | 8.19  | 4800 | 30.0  | 46.02 | 1.7162          | 67.6722  |
| 0.0138        | 8.36  | 4900 | 30.34 | 46.01 | 1.7113          | 66.6817  |
| 0.0098        | 8.53  | 5000 | 29.96 | 45.61 | 1.7177          | 66.9968  |
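
For reference, the three evaluation metrics reported above (BLEU, ChrF, WER) can be computed with the Hugging Face evaluate library; a minimal sketch, assuming placeholder lists of predicted and reference English translations:

```python
# Illustrative metric computation; "predictions" and "references" are
# placeholder lists, not data from this model's evaluation set.
import evaluate

predictions = ["the weather is fine today"]
references = ["the weather is fine today"]

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
wer = evaluate.load("wer")

# sacrebleu and chrf expect one list of references per prediction.
print("BLEU:", bleu.compute(predictions=predictions,
                            references=[[r] for r in references])["score"])
print("ChrF:", chrf.compute(predictions=predictions,
                            references=[[r] for r in references])["score"])
print("WER:", wer.compute(predictions=predictions, references=references))
```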

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2