
Mixtral_Alpace_v2

This model is a fine-tuned version of mistralai/Mixtral-8x7B-v0.1 on the generator dataset. It achieves the following results on the evaluation set (a short usage sketch follows the results):

  • Loss: 0.5881
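
Since this is a PEFT adapter on top of mistralai/Mixtral-8x7B-v0.1, it can be loaded with the `peft` and `transformers` libraries. The snippet below is a minimal sketch, assuming this repository (Cem13/mixtral_semptom_1) hosts the adapter weights and that enough memory is available for the base model; the prompt is purely illustrative.

```python
# Minimal loading sketch (assumptions: this repo hosts the PEFT adapter,
# and the base model fits in available memory via device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mixtral-8x7B-v0.1"
adapter_id = "Cem13/mixtral_semptom_1"  # this repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the adapter

prompt = "Hello, how are you?"  # illustrative prompt only
inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```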

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2.5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 15
  • num_epochs: 15
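
For reference, the values above map onto `transformers.TrainingArguments` roughly as sketched below. This is a reconstruction, not the original training script: the output directory is a placeholder, and the optimizer comment reflects the Trainer default (AdamW with betas=(0.9, 0.999) and epsilon=1e-08), which matches the values reported above.

```python
# Reconstructed configuration sketch; output_dir is a placeholder,
# not the actual path used for training.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mixtral_alpace_v2",   # placeholder
    learning_rate=2.5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=15,
    num_train_epochs=15,
    # Optimizer: the Trainer default AdamW uses betas=(0.9, 0.999) and
    # epsilon=1e-08, matching the settings listed above.
)
```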

Training results

Training Loss Epoch Step Validation Loss
1.5291 0.0870 10 1.6326
1.58 0.1739 20 1.5665
1.4109 0.2609 30 1.4856
1.4493 0.3478 40 1.4159
1.2503 0.4348 50 1.3493
1.2441 0.5217 60 1.2719
1.1923 0.6087 70 1.1930
1.1158 0.6957 80 1.1193
1.0184 0.7826 90 1.0541
1.0231 0.8696 100 1.0056
0.9731 0.9565 110 0.9619
0.892 1.0435 120 0.9170
0.911 1.1304 130 0.8727
0.7789 1.2174 140 0.8338
0.8049 1.3043 150 0.8041
0.7691 1.3913 160 0.7788
0.7869 1.4783 170 0.7589
0.7366 1.5652 180 0.7428
0.7436 1.6522 190 0.7282
0.7271 1.7391 200 0.7157
0.6809 1.8261 210 0.7056
0.7068 1.9130 220 0.6960
0.6446 2.0 230 0.6872
0.6682 2.0870 240 0.6819
0.7003 2.1739 250 0.6745
0.6859 2.2609 260 0.6701
0.6169 2.3478 270 0.6655
0.666 2.4348 280 0.6607
0.6325 2.5217 290 0.6575
0.6408 2.6087 300 0.6536
0.6371 2.6957 310 0.6507
0.5933 2.7826 320 0.6474
0.6313 2.8696 330 0.6450
0.6453 2.9565 340 0.6421
0.6807 3.0435 350 0.6407
0.6217 3.1304 360 0.6390
0.589 3.2174 370 0.6355
0.5591 3.3043 380 0.6337
0.6818 3.3913 390 0.6319
0.6269 3.4783 400 0.6306
0.611 3.5652 410 0.6286
0.5602 3.6522 420 0.6268
0.6735 3.7391 430 0.6251
0.5269 3.8261 440 0.6246
0.6109 3.9130 450 0.6232
0.5745 4.0 460 0.6221
0.6348 4.0870 470 0.6227
0.5398 4.1739 480 0.6203
0.6145 4.2609 490 0.6194
0.621 4.3478 500 0.6178
0.6123 4.4348 510 0.6172
0.6113 4.5217 520 0.6162
0.5991 4.6087 530 0.6154
0.5244 4.6957 540 0.6143
0.5832 4.7826 550 0.6136
0.6284 4.8696 560 0.6120
0.54 4.9565 570 0.6121
0.541 5.0435 580 0.6120
0.5204 5.1304 590 0.6108
0.5961 5.2174 600 0.6101
0.5522 5.3043 610 0.6098
0.5778 5.3913 620 0.6087
0.6059 5.4783 630 0.6090
0.5852 5.5652 640 0.6085
0.5687 5.6522 650 0.6072
0.5685 5.7391 660 0.6061
0.593 5.8261 670 0.6052
0.5975 5.9130 680 0.6055
0.5489 6.0 690 0.6047
0.567 6.0870 700 0.6049
0.5706 6.1739 710 0.6035
0.658 6.2609 720 0.6024
0.559 6.3478 730 0.6023
0.545 6.4348 740 0.6019
0.6096 6.5217 750 0.6021
0.5385 6.6087 760 0.6018
0.5505 6.6957 770 0.6012
0.5058 6.7826 780 0.6003
0.5899 6.8696 790 0.5999
0.5102 6.9565 800 0.5995
0.5185 7.0435 810 0.5995
0.5055 7.1304 820 0.5991
0.5907 7.2174 830 0.5997
0.5636 7.3043 840 0.5991
0.5505 7.3913 850 0.5986
0.5621 7.4783 860 0.5977
0.4968 7.5652 870 0.5976
0.5713 7.6522 880 0.5970
0.5968 7.7391 890 0.5970
0.531 7.8261 900 0.5964
0.538 7.9130 910 0.5959
0.6087 8.0 920 0.5959
0.5845 8.0870 930 0.5963
0.5197 8.1739 940 0.5960
0.5128 8.2609 950 0.5959
0.5613 8.3478 960 0.5956
0.5268 8.4348 970 0.5953
0.5696 8.5217 980 0.5952
0.5755 8.6087 990 0.5941
0.5014 8.6957 1000 0.5945
0.5568 8.7826 1010 0.5936
0.5934 8.8696 1020 0.5944
0.5178 8.9565 1030 0.5941
0.4618 9.0435 1040 0.5936
0.4867 9.1304 1050 0.5934
0.5402 9.2174 1060 0.5937
0.5177 9.3043 1070 0.5936
0.5825 9.3913 1080 0.5926
0.5523 9.4783 1090 0.5929
0.583 9.5652 1100 0.5920
0.5232 9.6522 1110 0.5927
0.5367 9.7391 1120 0.5920
0.5321 9.8261 1130 0.5913
0.5672 9.9130 1140 0.5910
0.5549 10.0 1150 0.5910
0.5191 10.0870 1160 0.5915
0.5463 10.1739 1170 0.5915
0.5275 10.2609 1180 0.5913
0.5484 10.3478 1190 0.5915
0.5293 10.4348 1200 0.5910
0.519 10.5217 1210 0.5903
0.5129 10.6087 1220 0.5898
0.5793 10.6957 1230 0.5900
0.4481 10.7826 1240 0.5901
0.5309 10.8696 1250 0.5903
0.5887 10.9565 1260 0.5898
0.5109 11.0435 1270 0.5907
0.5776 11.1304 1280 0.5902
0.4984 11.2174 1290 0.5898
0.5656 11.3043 1300 0.5898
0.4931 11.3913 1310 0.5902
0.531 11.4783 1320 0.5900
0.5163 11.5652 1330 0.5892
0.5314 11.6522 1340 0.5894
0.4766 11.7391 1350 0.5893
0.5201 11.8261 1360 0.5896
0.6127 11.9130 1370 0.5889
0.5441 12.0 1380 0.5888
0.5258 12.0870 1390 0.5894
0.5722 12.1739 1400 0.5887
0.5228 12.2609 1410 0.5891
0.524 12.3478 1420 0.5884
0.4951 12.4348 1430 0.5894
0.5235 12.5217 1440 0.5893
0.5071 12.6087 1450 0.5889
0.5417 12.6957 1460 0.5886
0.4882 12.7826 1470 0.5889
0.548 12.8696 1480 0.5889
0.529 12.9565 1490 0.5889
0.5646 13.0435 1500 0.5887
0.5142 13.1304 1510 0.5889
0.5161 13.2174 1520 0.5886
0.5008 13.3043 1530 0.5888
0.5187 13.3913 1540 0.5887
0.5334 13.4783 1550 0.5886
0.5099 13.5652 1560 0.5884
0.5644 13.6522 1570 0.5888
0.5242 13.7391 1580 0.5882
0.4912 13.8261 1590 0.5886
0.5459 13.9130 1600 0.5884
0.5204 14.0 1610 0.5881
0.4644 14.0870 1620 0.5884
0.5364 14.1739 1630 0.5885
0.5852 14.2609 1640 0.5887
0.5135 14.3478 1650 0.5884
0.5192 14.4348 1660 0.5885
0.5093 14.5217 1670 0.5880
0.5398 14.6087 1680 0.5884
0.469 14.6957 1690 0.5882
0.5163 14.7826 1700 0.5883
0.5165 14.8696 1710 0.5883
0.5441 14.9565 1720 0.5881

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
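
To reproduce the environment, the following sketch checks that the installed library versions match the ones listed above (package names assumed to be the standard PyPI distributions).

```python
# Version sanity check against the framework versions listed above.
import peft, transformers, torch, datasets, tokenizers

print("PEFT:        ", peft.__version__)          # expected 0.12.0
print("Transformers:", transformers.__version__)  # expected 4.44.0
print("PyTorch:     ", torch.__version__)         # expected 2.4.0+cu121
print("Datasets:    ", datasets.__version__)      # expected 2.20.0
print("Tokenizers:  ", tokenizers.__version__)    # expected 0.19.1
```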