test-trainer

This model is a fine-tuned version of bert-base-uncased on an unknown dataset. No summary evaluation results are listed; the validation loss logged during training is tabulated below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
training_steps: 1377
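
The linear scheduler entry above can be made concrete with a small sketch. It assumes zero warmup steps (the card lists no warmup setting) and a rate that decays linearly from learning_rate to 0 over training_steps. The helper name `linear_lr` is hypothetical and not part of any library.

```python
# Hypothetical helper: linear decay from the peak rate down to zero.
# Assumption: no warmup, since the card does not list any warmup steps.
PEAK_LR = 5e-05      # learning_rate from the card
TOTAL_STEPS = 1377   # training_steps from the card


def linear_lr(step: int, peak_lr: float = PEAK_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate in effect after `step` optimizer steps have been taken."""
    remaining = max(0, total_steps - step)
    return peak_lr * (remaining / total_steps)


if __name__ == "__main__":
    # Print the rate at the start, at two intermediate points, and at the end.
    for step in (0, 459, 918, 1377):
        print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 5e-05 and reaches zero at step 1377; a different warmup setting would change the early part of the curve.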

Training Loss	Epoch	Step	Validation Loss
No log	0.1089	50	0.5885
0.6475	0.2179	100	0.5992
0.6475	0.3268	150	0.5555
0.5984	0.4357	200	0.5674
0.5984	0.5447	250	0.8102
0.574	0.6536	300	0.5246
0.574	0.7625	350	0.5154
0.5728	0.8715	400	0.5616
0.5728	0.9804	450	0.5247
0.4933	1.0893	500	0.4771
0.4933	1.1983	550	0.5082
0.4332	1.3072	600	0.4866
0.4332	1.4161	650	0.4762
0.4269	1.5251	700	0.3891
0.4269	1.6340	750	0.4092
0.3825	1.7429	800	0.4439
0.3825	1.8519	850	0.3988
0.4038	1.9608	900	0.4035
0.4038	2.0697	950	0.5283
0.2891	2.1786	1000	0.5314
0.2891	2.2876	1050	0.5842
0.2558	2.3965	1100	0.5879
0.2558	2.5054	1150	0.5792
0.2529	2.6144	1200	0.5626
0.2529	2.7233	1250	0.5591
0.2729	2.8322	1300	0.5504
0.2729	2.9412	1350	0.5319
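
A short sketch for reading the table above: it picks the logged step with the lowest validation loss and converts steps to epochs. The value of 459 steps per epoch is an assumption inferred from the table (step 50 corresponds to epoch 0.1089), and the function names are hypothetical.

```python
# (step, validation loss) pairs copied from the table above.
VAL_LOSS = [
    (50, 0.5885), (100, 0.5992), (150, 0.5555), (200, 0.5674), (250, 0.8102),
    (300, 0.5246), (350, 0.5154), (400, 0.5616), (450, 0.5247), (500, 0.4771),
    (550, 0.5082), (600, 0.4866), (650, 0.4762), (700, 0.3891), (750, 0.4092),
    (800, 0.4439), (850, 0.3988), (900, 0.4035), (950, 0.5283), (1000, 0.5314),
    (1050, 0.5842), (1100, 0.5879), (1150, 0.5792), (1200, 0.5626), (1250, 0.5591),
    (1300, 0.5504), (1350, 0.5319),
]

# Assumption: inferred from the Epoch column, not stated in the card.
STEPS_PER_EPOCH = 459


def best_checkpoint(log):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


def step_to_epoch(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Convert an optimizer step count to a fractional epoch, rounded to 4 places."""
    return round(step / steps_per_epoch, 4)


if __name__ == "__main__":
    step, loss = best_checkpoint(VAL_LOSS)
    print(f"lowest validation loss {loss} at step {step} (epoch {step_to_epoch(step)})")
```

In the logged values, validation loss is lowest partway through the second epoch and is higher at every evaluation after step 900, while training loss keeps falling. That pattern is often a sign of overfitting, so an earlier checkpoint may be worth comparing against the final one.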

Safetensors
Model size: 0.1B params
Tensor type: F32
