OpenHermes-2.5-Mistral-7B-JEP

This model is a fine-tuned version of teknium/OpenHermes-2.5-Mistral-7B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8782
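This repository is published as a PEFT adapter for the base model (see the model tree at the end of this card), so inference requires loading teknium/OpenHermes-2.5-Mistral-7B first and then attaching the adapter. The following is a minimal sketch, assuming a standard Transformers + PEFT setup; the prompt and generation settings are illustrative only.

```python
# Minimal sketch: load the base model and attach this PEFT adapter.
# Repo ids follow this card; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "teknium/OpenHermes-2.5-Mistral-7B"
adapter_id = "raulgdp/OpenHermes-2.5-Mistral-7B-JEP"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Explain gradient accumulation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```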

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: paged_adamw_8bit with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
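For reference, here is a minimal sketch of how these settings map onto a Transformers TrainingArguments object, assuming the run used the Hugging Face Trainer; the output directory is a placeholder, and the dataset and PEFT/LoRA configuration are not documented in this card.

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments.
# output_dir is a hypothetical placeholder; dataset and LoRA settings are not documented.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="OpenHermes-2.5-Mistral-7B-JEP",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,               # total train batch size of 4
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",                    # betas=(0.9, 0.999), eps=1e-08 (defaults)
    seed=42,
    fp16=True,                                   # mixed precision via native AMP
)
```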

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.0446        | 0.1536 | 100  | 1.0660          |
| 1.0028        | 0.3071 | 200  | 1.0076          |
| 0.9928        | 0.4607 | 300  | 0.9807          |
| 0.9501        | 0.6142 | 400  | 0.9642          |
| 0.9755        | 0.7678 | 500  | 0.9480          |
| 0.9569        | 0.9213 | 600  | 0.9276          |
| 0.831         | 1.0737 | 700  | 0.9001          |
| 0.8519        | 1.2273 | 800  | 0.8891          |
| 0.929         | 1.3808 | 900  | 0.8798          |
| 0.9013        | 1.5344 | 1000 | 0.8755          |
| 0.8908        | 1.6879 | 1100 | 0.8724          |
| 0.8398        | 1.8415 | 1200 | 0.8684          |
| 0.8601        | 1.9950 | 1300 | 0.8636          |
| 0.8309        | 2.1474 | 1400 | 0.8659          |
| 0.8484        | 2.3010 | 1500 | 0.8624          |
| 0.8051        | 2.4545 | 1600 | 0.8593          |
| 0.786         | 2.6081 | 1700 | 0.8581          |
| 0.8941        | 2.7616 | 1800 | 0.8552          |
| 0.7945        | 2.9152 | 1900 | 0.8531          |
| 0.8118        | 3.0676 | 2000 | 0.8578          |
| 0.7879        | 3.2211 | 2100 | 0.8552          |
| 0.7909        | 3.3747 | 2200 | 0.8522          |
| 0.8119        | 3.5282 | 2300 | 0.8515          |
| 0.7983        | 3.6818 | 2400 | 0.8529          |
| 0.784         | 3.8353 | 2500 | 0.8492          |
| 0.8524        | 3.9889 | 2600 | 0.8476          |
| 0.7776        | 4.1413 | 2700 | 0.8522          |
| 0.7702        | 4.2948 | 2800 | 0.8516          |
| 0.8429        | 4.4484 | 2900 | 0.8508          |
| 0.7716        | 4.6019 | 3000 | 0.8484          |
| 0.7694        | 4.7555 | 3100 | 0.8501          |
| 0.7423        | 4.9090 | 3200 | 0.8490          |
| 0.7233        | 5.0614 | 3300 | 0.8531          |
| 0.7353        | 5.2150 | 3400 | 0.8563          |
| 0.6947        | 5.3685 | 3500 | 0.8551          |
| 0.702         | 5.5221 | 3600 | 0.8557          |
| 0.7445        | 5.6756 | 3700 | 0.8509          |
| 0.7553        | 5.8292 | 3800 | 0.8521          |
| 0.7937        | 5.9827 | 3900 | 0.8538          |
| 0.6512        | 6.1351 | 4000 | 0.8611          |
| 0.7081        | 6.2887 | 4100 | 0.8607          |
| 0.6879        | 6.4422 | 4200 | 0.8586          |
| 0.7151        | 6.5958 | 4300 | 0.8594          |
| 0.7207        | 6.7493 | 4400 | 0.8569          |
| 0.693         | 6.9029 | 4500 | 0.8601          |
| 0.7079        | 7.0553 | 4600 | 0.8644          |
| 0.7143        | 7.2088 | 4700 | 0.8692          |
| 0.7219        | 7.3624 | 4800 | 0.8661          |
| 0.6574        | 7.5159 | 4900 | 0.8673          |
| 0.6567        | 7.6695 | 5000 | 0.8644          |
| 0.671         | 7.8230 | 5100 | 0.8680          |
| 0.6771        | 7.9766 | 5200 | 0.8638          |
| 0.6429        | 8.1290 | 5300 | 0.8751          |
| 0.6931        | 8.2825 | 5400 | 0.8729          |
| 0.7037        | 8.4361 | 5500 | 0.8732          |
| 0.685         | 8.5896 | 5600 | 0.8736          |
| 0.7175        | 8.7432 | 5700 | 0.8734          |
| 0.6651        | 8.8967 | 5800 | 0.8719          |
| 0.6482        | 9.0491 | 5900 | 0.8764          |
| 0.6783        | 9.2027 | 6000 | 0.8766          |
| 0.6237        | 9.3562 | 6100 | 0.8801          |
| 0.6819        | 9.5098 | 6200 | 0.8780          |
| 0.7051        | 9.6633 | 6300 | 0.8778          |
| 0.6521        | 9.8169 | 6400 | 0.8783          |
| 0.6796        | 9.9704 | 6500 | 0.8782          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu126
  • Datasets 3.5.0
  • Tokenizers 0.21.1
Model tree for raulgdp/OpenHermes-2.5-Mistral-7B-JEP

  • Adapter of teknium/OpenHermes-2.5-Mistral-7B