Mixtral_Alpace_v2

This model is a fine-tuned version of mistralai/Mixtral-8x7B-v0.1 on the generator dataset. It achieves the following results on the evaluation set:

Loss: 0.3154
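
For a rough sense of scale, the evaluation loss can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is how causal language model losses are usually computed in Transformers. The card itself does not state this, so treat the result as indicative only.

```python
import math

# Evaluation loss reported on this card.
eval_loss = 0.3154

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.4f}")
```

Under that assumption the evaluation perplexity comes out at roughly 1.37. The number is only comparable with other models evaluated on the same data and tokenizer.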

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2.5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 15
num_epochs: 5
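
To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup in plain Python. It follows the usual definition (linear ramp from 0 to the base rate over the warmup steps, then linear decay to 0 at the last step). The total of 1530 steps is an assumption taken from the final row of the training results, and the function name linear_lr is illustrative, not part of any library.

```python
BASE_LR = 2.5e-05     # learning_rate
WARMUP_STEPS = 15     # lr_scheduler_warmup_steps
TOTAL_STEPS = 1530    # assumption: last step logged in the training results


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        # Ramp up from 0 to the base rate during warmup.
        scale = step / max(1, WARMUP_STEPS)
    else:
        # Decay linearly from the base rate to 0 at TOTAL_STEPS.
        scale = max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return BASE_LR * scale


# Print the rate at the start, end of warmup, mid-run, and final step.
for step in (0, 15, 765, 1530):
    print(step, linear_lr(step))
```

The rate peaks at the base value when warmup ends (step 15) and reaches zero at the final step, so most of the run is spent in the decay phase.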

Training results

Training Loss	Epoch	Step	Validation Loss
0.3573	0.0327	10	0.3448
0.3569	0.0654	20	0.3446
0.365	0.0980	30	0.3439
0.341	0.1307	40	0.3437
0.3101	0.1634	50	0.3428
0.3538	0.1961	60	0.3419
0.32	0.2288	70	0.3414
0.3361	0.2614	80	0.3403
0.3211	0.2941	90	0.3395
0.3583	0.3268	100	0.3386
0.3174	0.3595	110	0.3382
0.3097	0.3922	120	0.3378
0.33	0.4248	130	0.3374
0.3159	0.4575	140	0.3368
0.3636	0.4902	150	0.3366
0.334	0.5229	160	0.3356
0.348	0.5556	170	0.3353
0.3296	0.5882	180	0.3350
0.3498	0.6209	190	0.3338
0.3461	0.6536	200	0.3337
0.3378	0.6863	210	0.3335
0.3114	0.7190	220	0.3327
0.3291	0.7516	230	0.3324
0.3189	0.7843	240	0.3320
0.3214	0.8170	250	0.3311
0.3117	0.8497	260	0.3309
0.3025	0.8824	270	0.3310
0.2679	0.9150	280	0.3306
0.3592	0.9477	290	0.3304
0.3097	0.9804	300	0.3296
0.3662	1.0131	310	0.3295
0.2969	1.0458	320	0.3292
0.3109	1.0784	330	0.3290
0.3369	1.1111	340	0.3287
0.3101	1.1438	350	0.3287
0.3264	1.1765	360	0.3283
0.3328	1.2092	370	0.3278
0.3234	1.2418	380	0.3276
0.301	1.2745	390	0.3278
0.3357	1.3072	400	0.3273
0.3058	1.3399	410	0.3271
0.3204	1.3725	420	0.3266
0.3393	1.4052	430	0.3265
0.288	1.4379	440	0.3265
0.3121	1.4706	450	0.3259
0.301	1.5033	460	0.3255
0.2912	1.5359	470	0.3254
0.3426	1.5686	480	0.3253
0.3256	1.6013	490	0.3254
0.291	1.6340	500	0.3253
0.3234	1.6667	510	0.3249
0.3024	1.6993	520	0.3242
0.3628	1.7320	530	0.3240
0.331	1.7647	540	0.3234
0.321	1.7974	550	0.3235
0.2981	1.8301	560	0.3230
0.3369	1.8627	570	0.3233
0.3033	1.8954	580	0.3227
0.3578	1.9281	590	0.3224
0.2838	1.9608	600	0.3224
0.3026	1.9935	610	0.3221
0.2858	2.0261	620	0.3228
0.3001	2.0588	630	0.3225
0.2974	2.0915	640	0.3219
0.3071	2.1242	650	0.3217
0.3216	2.1569	660	0.3217
0.3056	2.1895	670	0.3216
0.3392	2.2222	680	0.3215
0.314	2.2549	690	0.3214
0.3243	2.2876	700	0.3210
0.3232	2.3203	710	0.3213
0.3365	2.3529	720	0.3211
0.3163	2.3856	730	0.3212
0.3086	2.4183	740	0.3211
0.3048	2.4510	750	0.3207
0.299	2.4837	760	0.3203
0.3203	2.5163	770	0.3203
0.278	2.5490	780	0.3200
0.3353	2.5817	790	0.3197
0.3314	2.6144	800	0.3198
0.2688	2.6471	810	0.3197
0.302	2.6797	820	0.3194
0.2843	2.7124	830	0.3195
0.3105	2.7451	840	0.3190
0.276	2.7778	850	0.3193
0.3206	2.8105	860	0.3192
0.3011	2.8431	870	0.3191
0.3367	2.8758	880	0.3189
0.2918	2.9085	890	0.3184
0.3343	2.9412	900	0.3187
0.2801	2.9739	910	0.3185
0.2959	3.0065	920	0.3185
0.3392	3.0392	930	0.3186
0.3197	3.0719	940	0.3182
0.2919	3.1046	950	0.3181
0.3544	3.1373	960	0.3182
0.2779	3.1699	970	0.3180
0.3001	3.2026	980	0.3180
0.3102	3.2353	990	0.3181
0.3152	3.2680	1000	0.3182
0.2962	3.3007	1010	0.3179
0.2831	3.3333	1020	0.3177
0.3103	3.3660	1030	0.3179
0.2766	3.3987	1040	0.3175
0.295	3.4314	1050	0.3175
0.3139	3.4641	1060	0.3176
0.299	3.4967	1070	0.3173
0.3034	3.5294	1080	0.3170
0.3052	3.5621	1090	0.3170
0.2937	3.5948	1100	0.3170
0.3046	3.6275	1110	0.3170
0.3094	3.6601	1120	0.3171
0.2875	3.6928	1130	0.3169
0.2847	3.7255	1140	0.3169
0.2947	3.7582	1150	0.3171
0.2925	3.7908	1160	0.3168
0.2938	3.8235	1170	0.3167
0.2955	3.8562	1180	0.3167
0.333	3.8889	1190	0.3167
0.3391	3.9216	1200	0.3165
0.2887	3.9542	1210	0.3166
0.3067	3.9869	1220	0.3163
0.3349	4.0196	1230	0.3164
0.308	4.0523	1240	0.3162
0.3252	4.0850	1250	0.3163
0.3077	4.1176	1260	0.3162
0.3198	4.1503	1270	0.3162
0.2891	4.1830	1280	0.3162
0.2712	4.2157	1290	0.3162
0.3083	4.2484	1300	0.3162
0.3032	4.2810	1310	0.3161
0.3024	4.3137	1320	0.3159
0.2966	4.3464	1330	0.3160
0.3046	4.3791	1340	0.3159
0.284	4.4118	1350	0.3158
0.2885	4.4444	1360	0.3157
0.2951	4.4771	1370	0.3158
0.2772	4.5098	1380	0.3157
0.305	4.5425	1390	0.3156
0.2834	4.5752	1400	0.3156
0.3365	4.6078	1410	0.3157
0.3128	4.6405	1420	0.3158
0.3004	4.6732	1430	0.3157
0.2844	4.7059	1440	0.3156
0.3193	4.7386	1450	0.3155
0.3053	4.7712	1460	0.3156
0.2961	4.8039	1470	0.3156
0.2999	4.8366	1480	0.3155
0.2644	4.8693	1490	0.3155
0.311	4.9020	1500	0.3155
0.3044	4.9346	1510	0.3155
0.3	4.9673	1520	0.3156
0.3378	5.0	1530	0.3154
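
The table can be summarized with a few lines of Python. The sketch below copies a handful of rows (first row, then roughly one per epoch) and derives the number of optimizer steps per epoch and the overall drop in validation loss. The row selection and variable names are illustrative choices, not part of the original training code.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table above.
rows = [
    (0.3573, 0.0327, 10, 0.3448),
    (0.3097, 0.9804, 300, 0.3296),
    (0.3026, 1.9935, 610, 0.3221),
    (0.2801, 2.9739, 910, 0.3185),
    (0.3067, 3.9869, 1220, 0.3163),
    (0.3378, 5.0, 1530, 0.3154),
]

first, last = rows[0], rows[-1]

# Optimizer steps per epoch, inferred from the final row (step 1530 at epoch 5.0).
steps_per_epoch = round(last[2] / last[1])

# Absolute and relative drop in validation loss over the logged run.
abs_drop = first[3] - last[3]
rel_drop = abs_drop / first[3]

print(f"steps per epoch: {steps_per_epoch}")
print(f"validation loss: {first[3]:.4f} -> {last[3]:.4f} "
      f"(-{abs_drop:.4f}, {rel_drop:.1%})")
```

This gives 306 steps per epoch and a validation-loss reduction of about 0.029 (roughly 8.5%) between step 10 and step 1530. Most of that improvement happens in the first two epochs, while the per-step training loss stays noisy throughout.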

Framework versions

PEFT 0.12.0
Transformers 4.44.2
Pytorch 2.4.0+cu121
Datasets 2.21.0
Tokenizers 0.19.1
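
Since PEFT appears in the framework versions, the repository plausibly contains an adapter rather than full model weights. The sketch below shows one way such an adapter could be loaded on top of the base model. The repository id Cem13/Mixtral_Alpace_v2, the adapter assumption, the bfloat16 dtype, and the helper names are all assumptions for illustration; the card does not document a loading procedure.

```python
BASE_MODEL = "mistralai/Mixtral-8x7B-v0.1"  # base model named on this card
ADAPTER_REPO = "Cem13/Mixtral_Alpace_v2"    # assumption: Hub repository id


def load_model_and_tokenizer(adapter_repo: str = ADAPTER_REPO,
                             base_model: str = BASE_MODEL):
    """Load the base model with the adapter applied, plus the base tokenizer.

    Assumes the repository holds a PEFT adapter whose config points at the
    base model; this is not confirmed by the card.
    """
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    model = AutoPeftModelForCausalLM.from_pretrained(
        adapter_repo,
        torch_dtype=torch.bfloat16,  # assumption: bf16 is acceptable for inference
        device_map="auto",           # requires the accelerate package
    )
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for a prompt and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling load_model_and_tokenizer() downloads the full Mixtral-8x7B base weights, which need on the order of 90 GB of memory in bfloat16, so nothing is loaded at import time. The card does not specify a prompt template, so any prompt passed to generate is a guess at the expected format.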

Cem13/Mixtral_Alpace_v2
