
Project5_V3_Mistral8x7b_V2.2.5

This model is a fine-tuned version of mistralai/Mixtral-8x7B-v0.1 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2430

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 128
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 40
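
As a minimal sketch, the listed values map onto `transformers.TrainingArguments` roughly as follows. The `output_dir` and the choice of `adamw_torch` as the optimizer implementation are assumptions; the remaining values mirror the list above.

```python
# Sketch only: reproduces the listed hyperparameters with
# transformers.TrainingArguments. output_dir is hypothetical.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Project5_V3_Mistral8x7b_V2.2.5",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=128,  # 2 x 128 = 256 total, assuming one device
    optim="adamw_torch",              # assumed; betas/epsilon below are the defaults
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=40,
)
```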

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 0.63  | 1    | 2.3024          |
| No log        | 1.88  | 3    | 2.2997          |
| No log        | 2.51  | 4    | 2.2977          |
| 2.2951        | 3.76  | 6    | 2.2935          |
| 2.2951        | 4.39  | 7    | 2.2905          |
| 2.2951        | 5.65  | 9    | 2.2853          |
| 2.2845        | 6.9   | 11   | 2.2789          |
| 2.2845        | 7.53  | 12   | 2.2757          |
| 2.2845        | 8.78  | 14   | 2.2696          |
| 2.2729        | 9.41  | 15   | 2.2667          |
| 2.2729        | 10.67 | 17   | 2.2619          |
| 2.2729        | 11.92 | 19   | 2.2576          |
| 2.2599        | 12.55 | 20   | 2.2555          |
| 2.2599        | 13.8  | 22   | 2.2519          |
| 2.2599        | 14.43 | 23   | 2.2504          |
| 2.2523        | 15.69 | 25   | 2.2479          |
| 2.2523        | 16.94 | 27   | 2.2462          |
| 2.2523        | 17.57 | 28   | 2.2454          |
| 2.2471        | 18.82 | 30   | 2.2442          |
| 2.2471        | 19.45 | 31   | 2.2437          |
| 2.2471        | 20.71 | 33   | 2.2432          |
| 2.2444        | 21.96 | 35   | 2.2427          |
| 2.2444        | 22.59 | 36   | 2.2428          |
| 2.2444        | 23.84 | 38   | 2.2429          |
| 2.2444        | 24.47 | 39   | 2.2429          |
| 2.2427        | 25.1  | 40   | 2.2430          |

Framework versions

  • PEFT 0.8.2
  • Transformers 4.38.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2
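
Since PEFT appears among the framework versions, the uploaded weights are presumably a PEFT adapter on top of mistralai/Mixtral-8x7B-v0.1. A minimal loading sketch follows; the adapter repo id is taken from the card title and `device_map="auto"` is an assumption.

```python
# Sketch of loading the PEFT adapter onto the Mixtral base model.
# The adapter repo id and device_map choice are assumptions.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mixtral-8x7B-v0.1"
adapter_id = "dominic5/Project5_V3_Mistral8x7b_V2.2.5"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attaches the fine-tuned adapter
```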
