# Ministral-8B-Instruct-2410-JEP
This model is a fine-tuned version of mistralai/Ministral-8B-Instruct-2410 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.1977
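PEFT is listed under the framework versions below, which suggests this repository hosts a parameter-efficient (LoRA-style) adapter rather than full model weights. Under that assumption, loading and querying the model would look roughly like the following sketch; the adapter repo id is inferred from this card's location and the chat-template call assumes the base tokenizer ships one, so verify both before relying on it:

```python
# Hypothetical loading sketch -- assumes this repo is a PEFT adapter on top of
# mistralai/Ministral-8B-Instruct-2410 (PEFT appears under "Framework versions").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Ministral-8B-Instruct-2410"
adapter_id = "raulgdp/Ministral-8B-Instruct-2410-JEP"  # assumption: this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Hello! What can you do?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```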
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: paged_adamw_8bit with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
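These settings map onto a Hugging Face `TrainingArguments` roughly as below. This is a reconstruction from the list above, not the author's published training script; the dataset, model preparation, and any PEFT/LoRA configuration remain unknown:

```python
# Reconstructed from the hyperparameter list above; the actual training
# script and dataset were not published with this card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="Ministral-8B-Instruct-2410-JEP",
    learning_rate=2e-5,
    per_device_train_batch_size=1,  # train_batch_size: 1
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    gradient_accumulation_steps=4,  # total_train_batch_size: 1 * 4 = 4
    seed=42,
    optim="paged_adamw_8bit",       # paged 8-bit AdamW via bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,  # "Native AMP"; the card does not say whether fp16 or bf16 was used
)
```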
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.3744 | 0.1535 | 100 | 1.3521 |
| 1.287 | 0.3070 | 200 | 1.2976 |
| 1.2346 | 0.4605 | 300 | 1.2699 |
| 1.2384 | 0.6140 | 400 | 1.2527 |
| 1.2937 | 0.7675 | 500 | 1.2421 |
| 1.2046 | 0.9210 | 600 | 1.2340 |
| 1.1915 | 1.0737 | 700 | 1.2277 |
| 1.2159 | 1.2272 | 800 | 1.2253 |
| 1.1631 | 1.3807 | 900 | 1.2206 |
| 1.1935 | 1.5342 | 1000 | 1.2162 |
| 1.1701 | 1.6876 | 1100 | 1.2129 |
| 1.1925 | 1.8411 | 1200 | 1.2067 |
| 1.2215 | 1.9946 | 1300 | 1.2037 |
| 1.1858 | 2.1474 | 1400 | 1.2032 |
| 1.1737 | 2.3008 | 1500 | 1.2008 |
| 1.1751 | 2.4543 | 1600 | 1.1988 |
| 1.1514 | 2.6078 | 1700 | 1.1957 |
| 1.1327 | 2.7613 | 1800 | 1.1930 |
| 1.1266 | 2.9148 | 1900 | 1.1906 |
| 1.0929 | 3.0675 | 2000 | 1.1909 |
| 1.1054 | 3.2210 | 2100 | 1.1913 |
| 1.1097 | 3.3745 | 2200 | 1.1896 |
| 1.2006 | 3.5280 | 2300 | 1.1869 |
| 1.1605 | 3.6815 | 2400 | 1.1839 |
| 1.1155 | 3.8350 | 2500 | 1.1844 |
| 1.1481 | 3.9885 | 2600 | 1.1836 |
| 1.1011 | 4.1412 | 2700 | 1.1878 |
| 1.0627 | 4.2947 | 2800 | 1.1897 |
| 1.1387 | 4.4482 | 2900 | 1.1863 |
| 1.0656 | 4.6017 | 3000 | 1.1826 |
| 1.0951 | 4.7552 | 3100 | 1.1837 |
| 1.0806 | 4.9087 | 3200 | 1.1795 |
| 1.0508 | 5.0614 | 3300 | 1.1830 |
| 1.1051 | 5.2149 | 3400 | 1.1876 |
| 1.0061 | 5.3684 | 3500 | 1.1894 |
| 1.1471 | 5.5219 | 3600 | 1.1811 |
| 1.1143 | 5.6754 | 3700 | 1.1833 |
| 1.1146 | 5.8289 | 3800 | 1.1823 |
| 1.0648 | 5.9823 | 3900 | 1.1837 |
| 1.062 | 6.1351 | 4000 | 1.1903 |
| 1.065 | 6.2886 | 4100 | 1.1877 |
| 1.0379 | 6.4421 | 4200 | 1.1875 |
| 1.0188 | 6.5955 | 4300 | 1.1873 |
| 1.0332 | 6.7490 | 4400 | 1.1850 |
| 1.026 | 6.9025 | 4500 | 1.1854 |
| 1.0365 | 7.0553 | 4600 | 1.1897 |
| 1.0359 | 7.2087 | 4700 | 1.1928 |
| 1.0483 | 7.3622 | 4800 | 1.1921 |
| 0.9988 | 7.5157 | 4900 | 1.1914 |
| 1.0348 | 7.6692 | 5000 | 1.1893 |
| 0.9884 | 7.8227 | 5100 | 1.1879 |
| 1.0903 | 7.9762 | 5200 | 1.1890 |
| 0.9946 | 8.1289 | 5300 | 1.1942 |
| 1.0328 | 8.2824 | 5400 | 1.1941 |
| 1.0031 | 8.4359 | 5500 | 1.1949 |
| 0.9096 | 8.5894 | 5600 | 1.1946 |
| 1.018 | 8.7429 | 5700 | 1.1939 |
| 1.0533 | 8.8964 | 5800 | 1.1920 |
| 0.9476 | 9.0491 | 5900 | 1.1967 |
| 0.9817 | 9.2026 | 6000 | 1.1989 |
| 0.9774 | 9.3561 | 6100 | 1.1987 |
| 1.0092 | 9.5096 | 6200 | 1.1974 |
| 1.0067 | 9.6631 | 6300 | 1.1977 |
| 1.0243 | 9.8166 | 6400 | 1.1983 |
| 0.9359 | 9.9701 | 6500 | 1.1977 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.6.0+cu126
- Datasets 3.5.0
- Tokenizers 0.21.1
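PEFT checkpoints can be sensitive to library versions, so it may be worth confirming that a local environment matches the versions above before loading the adapter. A small, optional sanity check:

```python
# Compares locally installed versions against those listed in this card.
expected = {
    "peft": "0.15.2",
    "transformers": "4.51.3",
    "torch": "2.6.0",  # the card lists the CUDA build 2.6.0+cu126
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}
for name, want in expected.items():
    got = __import__(name).__version__
    marker = "OK" if got.startswith(want) else "MISMATCH"
    print(f"{marker}: {name} {got} (card lists {want})")
```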