fine_tuned_roct_callback10

This model is a fine-tuned version of Qwen/Qwen2-1.5B on an unknown dataset. Its training loss, validation loss, and accuracy on the evaluation set are logged in the results table below.

Model description

More information needed


The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3
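
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It rests on assumptions the card does not state: zero warmup steps, no gradient accumulation, and a steps-per-epoch figure inferred from one logged row of the results table (step 2700 at epoch 0.8333). The function name `linear_lr` is hypothetical and only mirrors the usual "linear warmup, then linear decay to zero" rule.

```python
# Hyperparameters as listed in this card.
LEARNING_RATE = 2e-05
NUM_EPOCHS = 3

# Assumption: steps per epoch is inferred from one logged row
# (step 2700 at epoch 0.8333); the card does not state it directly.
STEPS_PER_EPOCH = round(2700 / 0.8333)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=LEARNING_RATE, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption: the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the schedule at the start, the last logged step, the midpoint, and the end.
    for step in (0, 2900, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the run would span 9720 optimizer steps, so the last logged row (step 2900) falls within the first epoch, where the learning rate is still above two thirds of its initial value.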

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.6746	0.0309	100	1.1656	0.8788
0.5117	0.0617	200	0.3858	0.8788
0.3739	0.0926	300	1.0712	0.8788
0.4961	0.1235	400	0.3711	0.8274
0.2688	0.1543	500	0.1956	0.8965
0.2633	0.1852	600	0.2161	0.9000
0.3015	0.2160	700	0.2866	0.9073
0.2304	0.2469	800	0.2740	0.9017
0.2114	0.2778	900	0.2462	0.9191
0.2807	0.3086	1000	0.2409	0.9125
0.2323	0.3395	1100	0.3777	0.8837
0.2514	0.3704	1200	0.2062	0.9281
0.2278	0.4012	1300	0.1762	0.9351
0.2099	0.4321	1400	0.1856	0.9247
0.2004	0.4630	1500	0.2237	0.9313
0.2177	0.4938	1600	0.1715	0.9313
0.3046	0.5247	1700	0.1545	0.9434
0.2179	0.5556	1800	0.1713	0.9472
0.1665	0.5864	1900	0.1142	0.9549
0.2066	0.6173	2000	0.1424	0.9563
0.1908	0.6481	2100	0.1284	0.9635
0.1450	0.6790	2200	0.1550	0.9618
0.1470	0.7099	2300	0.7114	0.8826
0.1634	0.7407	2400	0.1536	0.9625
0.1184	0.7716	2500	0.2507	0.9458
0.1771	0.8025	2600	0.1449	0.9583
0.1399	0.8333	2700	0.2384	0.9347
0.1709	0.8642	2800	0.1296	0.9542
0.1545	0.8951	2900	0.1842	0.9587
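
The logged rows can be scanned programmatically to pick a checkpoint and to spot unstable evaluations. The sketch below copies the Step, Validation Loss, and Accuracy columns from the table above; the helper names and the "more than 2x the previous validation loss" spike rule are illustrative assumptions, not part of the training setup.

```python
# (step, validation_loss, accuracy) copied from the results table.
LOG = """
100 1.1656 0.8788
200 0.3858 0.8788
300 1.0712 0.8788
400 0.3711 0.8274
500 0.1956 0.8965
600 0.2161 0.9000
700 0.2866 0.9073
800 0.2740 0.9017
900 0.2462 0.9191
1000 0.2409 0.9125
1100 0.3777 0.8837
1200 0.2062 0.9281
1300 0.1762 0.9351
1400 0.1856 0.9247
1500 0.2237 0.9313
1600 0.1715 0.9313
1700 0.1545 0.9434
1800 0.1713 0.9472
1900 0.1142 0.9549
2000 0.1424 0.9563
2100 0.1284 0.9635
2200 0.1550 0.9618
2300 0.7114 0.8826
2400 0.1536 0.9625
2500 0.2507 0.9458
2600 0.1449 0.9583
2700 0.2384 0.9347
2800 0.1296 0.9542
2900 0.1842 0.9587
"""


def parse_log(text):
    """Turn whitespace-separated rows into (step, val_loss, accuracy) tuples."""
    rows = []
    for line in text.strip().splitlines():
        step, val_loss, accuracy = line.split()
        rows.append((int(step), float(val_loss), float(accuracy)))
    return rows


def find_spikes(rows, factor=2.0):
    """Steps whose validation loss exceeds `factor` times the previous row's.

    The factor of 2.0 is an arbitrary illustrative threshold.
    """
    return [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[1] > factor * prev[1]]


rows = parse_log(LOG)
best_loss = min(rows, key=lambda r: r[1])  # row with the lowest validation loss
best_acc = max(rows, key=lambda r: r[2])   # row with the highest accuracy
spikes = find_spikes(rows)

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("highest accuracy:", best_acc)
    print("validation-loss spikes at steps:", spikes)
```

Among the logged rows, the lowest validation loss is 0.1142 at step 1900 and the highest accuracy is 0.9635 at step 2100. The two criteria disagree, so which checkpoint to keep depends on the metric that matters for the downstream task.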