hubert-large-timit-upsample-decoder

This model is a fine-tuned version of facebook/hubert-large-ls960-ft on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4975
  • Wer: 0.9749
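
Below is a minimal inference sketch, not taken from the model card, showing how a checkpoint like this is typically loaded for CTC speech recognition with the Transformers library. The audio path is a placeholder, and the use of a standard `HubertForCTC` head and `AutoProcessor` is an assumption; if the "upsample decoder" in the name refers to a custom head, the loading path may need adjustment.

```python
# Hedged sketch: standard HuBERT CTC inference; "example.wav" is a placeholder file.
import torch
import librosa
from transformers import AutoProcessor, HubertForCTC

model_id = "nh0znoisung/hubert-large-timit-upsample-decoder"
processor = AutoProcessor.from_pretrained(model_id)  # assumes a Wav2Vec2-style processor is shipped
model = HubertForCTC.from_pretrained(model_id)       # assumes a standard CTC head

# HuBERT expects 16 kHz mono audio.
speech, _ = librosa.load("example.wav", sr=16000, mono=True)

inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```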

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 40
  • mixed_precision_training: Native AMP
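
The following is a hedged sketch, not the original training script, of how the hyperparameters above map onto `transformers.TrainingArguments`; the `output_dir` is a placeholder.

```python
# Sketch only: mirrors the listed hyperparameters; output_dir is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hubert-large-timit-upsample-decoder",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    seed=42,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=40,
    fp16=True,                       # "Native AMP" mixed-precision training
)
```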

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 83.048        | 2.4752  | 500  | 48.3229         | 0.9459 |
| 2.1777        | 4.9505  | 1000 | 3.3841          | 0.9775 |
| 2.5735        | 7.4257  | 1500 | 2.1042          | 0.9698 |
| 5.3683        | 9.9010  | 2000 | 1.5244          | 0.9699 |
| 1.906         | 12.3762 | 2500 | 1.3064          | 0.9128 |
| 1.9468        | 14.8515 | 3000 | 1.3597          | 0.9174 |
| 1.6598        | 17.3267 | 3500 | 1.1801          | 0.9093 |
| 1.2808        | 19.8020 | 4000 | 1.6481          | 0.9181 |
| 2.0953        | 22.2772 | 4500 | 3.1021          | 0.9602 |
| 0.5282        | 24.7525 | 5000 | 0.5278          | 0.9755 |
| 7.1607        | 27.2277 | 5500 | 0.9557          | 0.9823 |
| 4.1975        | 29.7030 | 6000 | 13.0365         | 0.9301 |
| 0.5248        | 32.1782 | 6500 | 0.5075          | 0.9840 |
| 0.5065        | 34.6535 | 7000 | 0.5001          | 0.9834 |
| 0.4997        | 37.1287 | 7500 | 0.5032          | 0.9793 |
| 0.5072        | 39.6040 | 8000 | 0.4975          | 0.9749 |
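
The Wer column above is word error rate. As a small illustrative sketch (the strings below are made up, not from this evaluation), it can be computed with the `evaluate` library:

```python
# Hedged example of the WER metric used in the table above.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]
print(wer_metric.compute(predictions=predictions, references=references))
```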

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.2.1
  • Datasets 3.6.0
  • Tokenizers 0.21.1