roberta-base-anion-1e-06-256

This model is a fine-tuned version of roberta-base on an unknown dataset. On the evaluation set, it reaches a validation loss of 0.3141 after the final epoch (step 10740).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 256
eval_batch_size: 256
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 30
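
For anyone trying to reproduce this setup, the values above map roughly onto keyword arguments of `transformers.TrainingArguments`. The sketch below is a reconstruction under stated assumptions, not the original training script: it assumes a single device (so the listed batch sizes are per-device values), no warmup (the card lists none), and 10740 total optimizer steps, as reported in the results table. The `output_dir` value and the `linear_lr` helper are hypothetical; the helper only illustrates how a linear schedule decays the learning rate to zero.

```python
# Keyword arguments that could be passed to transformers.TrainingArguments(...).
# Assumption: single device, so the card's batch sizes are per-device values.
training_kwargs = {
    "output_dir": "roberta-base-anion-1e-06-256",  # hypothetical, taken from the card title
    "learning_rate": 1e-06,
    "per_device_train_batch_size": 256,
    "per_device_eval_batch_size": 256,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}

TOTAL_STEPS = 10740  # final step in the results table (30 epochs x 358 steps)


def linear_lr(step, base_lr=1e-06, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Assumption: warmup_steps=0, since the card does not mention warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the learning rate starts at 1e-06, falls to about half of that midway through training (step 5370), and reaches zero at step 10740.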

Training Loss	Epoch	Step	Validation Loss
0.6931	1.0	358	0.5325
0.5146	2.0	716	0.4410
0.4486	3.0	1074	0.4091
0.4173	4.0	1432	0.3935
0.4088	5.0	1790	0.3771
0.3865	6.0	2148	0.3675
0.3750	7.0	2506	0.3563
0.3622	8.0	2864	0.3519
0.3546	9.0	3222	0.3412
0.3462	10.0	3580	0.3411
0.3407	11.0	3938	0.3398
0.3323	12.0	4296	0.3319
0.3254	13.0	4654	0.3299
0.3208	14.0	5012	0.3275
0.3147	15.0	5370	0.3243
0.3066	16.0	5728	0.3237
0.3084	17.0	6086	0.3219
0.3018	18.0	6444	0.3199
0.3042	19.0	6802	0.3203
0.2987	20.0	7160	0.3192
0.2914	21.0	7518	0.3191
0.2931	22.0	7876	0.3180
0.2902	23.0	8234	0.3167
0.2868	24.0	8592	0.3166
0.2841	25.0	8950	0.3141
0.2841	26.0	9308	0.3142
0.2831	27.0	9666	0.3142
0.2842	28.0	10024	0.3140
0.2819	29.0	10382	0.3142
0.2811	30.0	10740	0.3141
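
The log above can be summarized with a few lines of Python. The sketch below copies the table into a list and reports the epoch with the lowest validation loss, the number of optimizer steps per epoch, and a rough bound on the training set size. The size bound assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these are stated in the card. `summarize` is a hypothetical helper written for this illustration.

```python
# (training_loss, epoch, step, validation_loss), copied from the table above.
LOG = [
    (0.6931, 1, 358, 0.5325), (0.5146, 2, 716, 0.4410),
    (0.4486, 3, 1074, 0.4091), (0.4173, 4, 1432, 0.3935),
    (0.4088, 5, 1790, 0.3771), (0.3865, 6, 2148, 0.3675),
    (0.3750, 7, 2506, 0.3563), (0.3622, 8, 2864, 0.3519),
    (0.3546, 9, 3222, 0.3412), (0.3462, 10, 3580, 0.3411),
    (0.3407, 11, 3938, 0.3398), (0.3323, 12, 4296, 0.3319),
    (0.3254, 13, 4654, 0.3299), (0.3208, 14, 5012, 0.3275),
    (0.3147, 15, 5370, 0.3243), (0.3066, 16, 5728, 0.3237),
    (0.3084, 17, 6086, 0.3219), (0.3018, 18, 6444, 0.3199),
    (0.3042, 19, 6802, 0.3203), (0.2987, 20, 7160, 0.3192),
    (0.2914, 21, 7518, 0.3191), (0.2931, 22, 7876, 0.3180),
    (0.2902, 23, 8234, 0.3167), (0.2868, 24, 8592, 0.3166),
    (0.2841, 25, 8950, 0.3141), (0.2841, 26, 9308, 0.3142),
    (0.2831, 27, 9666, 0.3142), (0.2842, 28, 10024, 0.3140),
    (0.2819, 29, 10382, 0.3142), (0.2811, 30, 10740, 0.3141),
]

TRAIN_BATCH_SIZE = 256  # train_batch_size from the hyperparameter list


def summarize(log, batch_size=TRAIN_BATCH_SIZE):
    """Summarize a per-epoch training log."""
    best = min(log, key=lambda row: row[3])   # row with the lowest validation loss
    steps_per_epoch = log[0][2] // log[0][1]  # step count after the first epoch
    final = log[-1]
    return {
        "best_epoch": best[1],
        "best_val_loss": best[3],
        "steps_per_epoch": steps_per_epoch,
        # Bounds on training set size, assuming one device, no gradient
        # accumulation, and a kept (possibly partial) last batch.
        "min_train_examples": (steps_per_epoch - 1) * batch_size + 1,
        "max_train_examples": steps_per_epoch * batch_size,
        # Gap between validation and training loss at the last epoch.
        "final_gap": round(final[3] - final[0], 4),
    }


summary = summarize(LOG)
```

On this log, the lowest validation loss is 0.3140 at epoch 28, and validation loss varies by at most 0.0002 from epoch 25 onward, which suggests training had effectively plateaued by then.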