git-base-www

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. At the final logged evaluation (epoch 50, step 950) it reports a validation loss of 0.5009 and a WER score of 8.6345 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 50
mixed_precision_training: Native AMP
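
The hyperparameters above can be written down as a small configuration sketch. The keyword names follow `transformers.TrainingArguments`, but the mapping is an assumption, as are the single training device, the absence of warmup (none is listed), and `fp16=True` as the setting behind "Native AMP". The two helper functions are hypothetical and only illustrate how the total batch size and the linear learning-rate decay follow from the listed values.

```python
# Hyperparameters copied from the list above.
# Key names follow transformers.TrainingArguments (assumed mapping).
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
    "fp16": True,  # assumption: "Native AMP" corresponds to fp16=True
}

TOTAL_STEPS = 950  # last optimizer step in the training log
NUM_DEVICES = 1    # assumption: training ran on a single device


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Samples contributing to one optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, base_lr=5e-05, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear schedule: optional linear warmup, then linear decay to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(training_kwargs))
print(linear_lr(0), linear_lr(475), linear_lr(950))
```

Under these assumptions, 2 samples per device times 8 accumulation steps gives the listed total_train_batch_size of 16, and the learning rate falls from 5e-05 at step 0 to zero at step 950.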

Training Loss	Epoch	Step	Validation Loss	WER Score
3.6172	2.64	50	4.5571	0.9603
1.2455	5.2667	100	0.7764	0.9629
0.2488	7.9067	150	0.4500	1.2879
0.1652	10.5333	200	0.4539	2.7259
0.1242	13.16	250	0.4558	1.0871
0.0958	15.8	300	0.4580	2.2966
0.074	18.4267	350	0.4629	4.6957
0.0633	21.0533	400	0.4711	2.5060
0.0505	23.6933	450	0.4758	4.9060
0.0435	26.32	500	0.4796	4.9629
0.0383	28.96	550	0.4866	4.8526
0.0316	31.5867	600	0.4893	7.25
0.0284	34.2133	650	0.4927	4.5267
0.0261	36.8533	700	0.4954	7.6767
0.0228	39.48	750	0.4967	8.1853
0.0216	42.1067	800	0.4981	5.2336
0.0209	44.7467	850	0.4997	8.6957
0.0192	47.3733	900	0.5008	8.5552
0.0203	50.0	950	0.5009	8.6345
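
As a rough aid for reading the log, the sketch below copies the rows into Python and looks up the evaluation step with the lowest validation loss and the lowest WER score. The `Row` type and the `best_by` helper are hypothetical names introduced for illustration; they are not part of the training code.

```python
from collections import namedtuple

# Same column order as the training log.
Row = namedtuple("Row", "train_loss epoch step val_loss wer")

# Rows copied verbatim from the training log.
LOG = [Row(*values) for values in [
    (3.6172, 2.64, 50, 4.5571, 0.9603),
    (1.2455, 5.2667, 100, 0.7764, 0.9629),
    (0.2488, 7.9067, 150, 0.4500, 1.2879),
    (0.1652, 10.5333, 200, 0.4539, 2.7259),
    (0.1242, 13.16, 250, 0.4558, 1.0871),
    (0.0958, 15.8, 300, 0.4580, 2.2966),
    (0.074, 18.4267, 350, 0.4629, 4.6957),
    (0.0633, 21.0533, 400, 0.4711, 2.5060),
    (0.0505, 23.6933, 450, 0.4758, 4.9060),
    (0.0435, 26.32, 500, 0.4796, 4.9629),
    (0.0383, 28.96, 550, 0.4866, 4.8526),
    (0.0316, 31.5867, 600, 0.4893, 7.25),
    (0.0284, 34.2133, 650, 0.4927, 4.5267),
    (0.0261, 36.8533, 700, 0.4954, 7.6767),
    (0.0228, 39.48, 750, 0.4967, 8.1853),
    (0.0216, 42.1067, 800, 0.4981, 5.2336),
    (0.0209, 44.7467, 850, 0.4997, 8.6957),
    (0.0192, 47.3733, 900, 0.5008, 8.5552),
    (0.0203, 50.0, 950, 0.5009, 8.6345),
]]


def best_by(rows, field):
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: getattr(row, field))


best_loss = best_by(LOG, "val_loss")
best_wer = best_by(LOG, "wer")
print("lowest validation loss:", best_loss.step, best_loss.val_loss)
print("lowest WER score:", best_wer.step, best_wer.wer)
```

In this log the validation loss is lowest at step 150 and the WER score is lowest at step 50, while the training loss keeps falling through step 900. A pattern like this often points to overfitting, so an earlier checkpoint may be worth comparing against the final one. WER counts inserted words as errors, so scores above 1.0 are possible when generated text is much longer than the reference.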