OpenHermes-2.5-Mistral-7B-JEP

This model is a fine-tuned version of teknium/OpenHermes-2.5-Mistral-7B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8782
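This repository is published as a PEFT adapter for the base model (see the model tree at the end of this card), so inference requires loading teknium/OpenHermes-2.5-Mistral-7B first and then attaching the adapter. The following is a minimal sketch, assuming a standard Transformers + PEFT setup; the prompt and generation settings are illustrative only.

```python
# Minimal sketch: load the base model and attach this PEFT adapter.
# Repo ids follow this card; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "teknium/OpenHermes-2.5-Mistral-7B"
adapter_id = "raulgdp/OpenHermes-2.5-Mistral-7B-JEP"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Explain gradient accumulation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```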

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: paged_adamw_8bit with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
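For reference, here is a minimal sketch of how these settings map onto a Transformers TrainingArguments object, assuming the run used the Hugging Face Trainer; the output directory is a placeholder, and the dataset and PEFT/LoRA configuration are not documented in this card.

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments.
# output_dir is a hypothetical placeholder; dataset and LoRA settings are not documented.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="OpenHermes-2.5-Mistral-7B-JEP",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,               # total train batch size of 4
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",                    # betas=(0.9, 0.999), eps=1e-08 (defaults)
    seed=42,
    fp16=True,                                   # mixed precision via native AMP
)
```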

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.0446        | 0.1536 | 100  | 1.0660          |
| 1.0028        | 0.3071 | 200  | 1.0076          |
| 0.9928        | 0.4607 | 300  | 0.9807          |
| 0.9501        | 0.6142 | 400  | 0.9642          |
| 0.9755        | 0.7678 | 500  | 0.9480          |
| 0.9569        | 0.9213 | 600  | 0.9276          |
| 0.831         | 1.0737 | 700  | 0.9001          |
| 0.8519        | 1.2273 | 800  | 0.8891          |
| 0.929         | 1.3808 | 900  | 0.8798          |
| 0.9013        | 1.5344 | 1000 | 0.8755          |
| 0.8908        | 1.6879 | 1100 | 0.8724          |
| 0.8398        | 1.8415 | 1200 | 0.8684          |
| 0.8601        | 1.9950 | 1300 | 0.8636          |
| 0.8309        | 2.1474 | 1400 | 0.8659          |
| 0.8484        | 2.3010 | 1500 | 0.8624          |
| 0.8051        | 2.4545 | 1600 | 0.8593          |
| 0.786         | 2.6081 | 1700 | 0.8581          |
| 0.8941        | 2.7616 | 1800 | 0.8552          |
| 0.7945        | 2.9152 | 1900 | 0.8531          |
| 0.8118        | 3.0676 | 2000 | 0.8578          |
| 0.7879        | 3.2211 | 2100 | 0.8552          |
| 0.7909        | 3.3747 | 2200 | 0.8522          |
| 0.8119        | 3.5282 | 2300 | 0.8515          |
| 0.7983        | 3.6818 | 2400 | 0.8529          |
| 0.784         | 3.8353 | 2500 | 0.8492          |
| 0.8524        | 3.9889 | 2600 | 0.8476          |
| 0.7776        | 4.1413 | 2700 | 0.8522          |
| 0.7702        | 4.2948 | 2800 | 0.8516          |
| 0.8429        | 4.4484 | 2900 | 0.8508          |
| 0.7716        | 4.6019 | 3000 | 0.8484          |
| 0.7694        | 4.7555 | 3100 | 0.8501          |
| 0.7423        | 4.9090 | 3200 | 0.8490          |
| 0.7233        | 5.0614 | 3300 | 0.8531          |
| 0.7353        | 5.2150 | 3400 | 0.8563          |
| 0.6947        | 5.3685 | 3500 | 0.8551          |
| 0.702         | 5.5221 | 3600 | 0.8557          |
| 0.7445        | 5.6756 | 3700 | 0.8509          |
| 0.7553        | 5.8292 | 3800 | 0.8521          |
| 0.7937        | 5.9827 | 3900 | 0.8538          |
| 0.6512        | 6.1351 | 4000 | 0.8611          |
| 0.7081        | 6.2887 | 4100 | 0.8607          |
| 0.6879        | 6.4422 | 4200 | 0.8586          |
| 0.7151        | 6.5958 | 4300 | 0.8594          |
| 0.7207        | 6.7493 | 4400 | 0.8569          |
| 0.693         | 6.9029 | 4500 | 0.8601          |
| 0.7079        | 7.0553 | 4600 | 0.8644          |
| 0.7143        | 7.2088 | 4700 | 0.8692          |
| 0.7219        | 7.3624 | 4800 | 0.8661          |
| 0.6574        | 7.5159 | 4900 | 0.8673          |
| 0.6567        | 7.6695 | 5000 | 0.8644          |
| 0.671         | 7.8230 | 5100 | 0.8680          |
| 0.6771        | 7.9766 | 5200 | 0.8638          |
| 0.6429        | 8.1290 | 5300 | 0.8751          |
| 0.6931        | 8.2825 | 5400 | 0.8729          |
| 0.7037        | 8.4361 | 5500 | 0.8732          |
| 0.685         | 8.5896 | 5600 | 0.8736          |
| 0.7175        | 8.7432 | 5700 | 0.8734          |
| 0.6651        | 8.8967 | 5800 | 0.8719          |
| 0.6482        | 9.0491 | 5900 | 0.8764          |
| 0.6783        | 9.2027 | 6000 | 0.8766          |
| 0.6237        | 9.3562 | 6100 | 0.8801          |
| 0.6819        | 9.5098 | 6200 | 0.8780          |
| 0.7051        | 9.6633 | 6300 | 0.8778          |
| 0.6521        | 9.8169 | 6400 | 0.8783          |
| 0.6796        | 9.9704 | 6500 | 0.8782          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu126
  • Datasets 3.5.0
  • Tokenizers 0.21.1
Model tree for raulgdp/OpenHermes-2.5-Mistral-7B-JEP

  • Adapter of teknium/OpenHermes-2.5-Mistral-7B