rationale_model_e10

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3.0

Training Loss	Epoch	Step	Validation Loss
2.0662	0.0477	500	1.9416
1.8844	0.0954	1000	1.9136
1.7819	0.1431	1500	1.9041
1.6587	0.1908	2000	1.9142
1.5711	0.2385	2500	1.9290
1.4686	0.2862	3000	1.9362
1.3787	0.3338	3500	2.0431
1.2464	0.3815	4000	2.0219
1.1407	0.4292	4500	2.0494
1.0591	0.4769	5000	2.0871
0.9351	0.5246	5500	2.1374
0.8295	0.5723	6000	2.1954
0.7724	0.6200	6500	2.2344
0.6506	0.6677	7000	2.2971
0.6109	0.7154	7500	2.3390
0.5302	0.7631	8000	2.4308
0.4378	0.8108	8500	2.5308
0.383	0.8585	9000	2.6438
0.3419	0.9061	9500	2.6942
0.2983	0.9538	10000	2.7862
0.2568	1.0015	10500	2.9069
0.186	1.0492	11000	2.8744
0.1799	1.0969	11500	2.9436
0.1831	1.1446	12000	2.9253
0.1751	1.1923	12500	3.0272
0.1652	1.2400	13000	3.0354
0.1644	1.2877	13500	3.0101
0.1569	1.3354	14000	3.0530
0.1554	1.3831	14500	3.0933
0.1498	1.4308	15000	3.1092
0.1424	1.4784	15500	3.1997
0.1417	1.5261	16000	3.1469
0.1385	1.5738	16500	3.2502
0.1355	1.6215	17000	3.2343
0.1323	1.6692	17500	3.2179
0.1279	1.7169	18000	3.2491
0.1268	1.7646	18500	3.2739
0.1206	1.8123	19000	3.3483
0.1211	1.8600	19500	3.3606
0.118	1.9077	20000	3.3723
0.1162	1.9554	20500	3.3527
0.1124	2.0031	21000	3.5134
0.0983	2.0507	21500	3.4884
0.1002	2.0984	22000	3.5197
0.1018	2.1461	22500	3.5413
0.0981	2.1938	23000	3.5697
0.097	2.2415	23500	3.5927
0.0949	2.2892	24000	3.5983
0.0971	2.3369	24500	3.6530
0.0952	2.3846	25000	3.6665
0.0973	2.4323	25500	3.6585
0.0915	2.4800	26000	3.7384
0.0918	2.5277	26500	3.7284
0.0918	2.5754	27000	3.7835
0.0885	2.6230	27500	3.8170
0.0891	2.6707	28000	3.8412
0.0901	2.7184	28500	3.8526
0.0878	2.7661	29000	3.8645
0.0864	2.8138	29500	3.9049
0.0866	2.8615	30000	3.9255
0.0853	2.9092	30500	3.9378
0.0858	2.9569	31000	3.9455