
working

This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6079
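
Since the checkpoint was trained with PEFT (see the framework versions below) on top of a GPTQ-quantized base model, it loads as an adapter rather than a full standalone model. The following is a minimal inference sketch, assuming the adapter is published as mohit19906/working and that a GPTQ backend (optimum + auto-gptq) is installed; the repo IDs, prompt, and generation settings are illustrative, not part of the original card.

```python
# Minimal sketch: load the PEFT adapter on top of the GPTQ base model.
# Assumes peft, transformers, optimum and auto-gptq are installed, and that
# the adapter lives at "mohit19906/working" (illustrative repo ID).
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

base_id = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
adapter_id = "mohit19906/working"

# AutoPeftModelForCausalLM reads the adapter config, loads the GPTQ base
# referenced there, and then attaches the adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

prompt = "[INST] Explain what a fine-tuned adapter is in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```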

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 6
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 50
  • mixed_precision_training: Native AMP
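
As a rough guide, the settings above map onto transformers TrainingArguments as in the sketch below; the output directory, evaluation/logging strategies, and optimizer name are assumptions filled in from Trainer defaults, not taken from the original training script.

```python
# Illustrative TrainingArguments mirroring the hyperparameters listed above.
# output_dir and the evaluation/logging strategies are assumed, not reported.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="working",            # assumed output directory
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    gradient_accumulation_steps=4,   # effective train batch size: 6 * 4 = 24
    seed=42,
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=50,
    fp16=True,                       # native AMP mixed precision
    evaluation_strategy="epoch",     # assumed; the results table logs one eval per epoch
    logging_strategy="epoch",
)
```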

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 4.3979 | 0.96 | 6 | 3.3561 |
| 2.837 | 1.92 | 12 | 2.2656 |
| 1.9777 | 2.88 | 18 | 1.7212 |
| 1.3641 | 4.0 | 25 | 1.4591 |
| 1.3384 | 4.96 | 31 | 1.2543 |
| 1.1314 | 5.92 | 37 | 1.1326 |
| 0.9904 | 6.88 | 43 | 1.0707 |
| 0.7908 | 8.0 | 50 | 1.0784 |
| 0.8779 | 8.96 | 56 | 1.0891 |
| 0.8415 | 9.92 | 62 | 1.1026 |
| 0.8044 | 10.88 | 68 | 1.1326 |
| 0.6611 | 12.0 | 75 | 1.1425 |
| 0.7385 | 12.96 | 81 | 1.2161 |
| 0.7071 | 13.92 | 87 | 1.2182 |
| 0.6841 | 14.88 | 93 | 1.2865 |
| 0.5671 | 16.0 | 100 | 1.3092 |
| 0.6442 | 16.96 | 106 | 1.3813 |
| 0.629 | 17.92 | 112 | 1.3295 |
| 0.6197 | 18.88 | 118 | 1.4387 |
| 0.522 | 20.0 | 125 | 1.3785 |
| 0.6013 | 20.96 | 131 | 1.4355 |
| 0.5928 | 21.92 | 137 | 1.4321 |
| 0.5901 | 22.88 | 143 | 1.4711 |
| 0.5015 | 24.0 | 150 | 1.4916 |
| 0.5817 | 24.96 | 156 | 1.5001 |
| 0.578 | 25.92 | 162 | 1.5077 |
| 0.5758 | 26.88 | 168 | 1.5173 |
| 0.4914 | 28.0 | 175 | 1.4935 |
| 0.5732 | 28.96 | 181 | 1.5161 |
| 0.5715 | 29.92 | 187 | 1.5131 |
| 0.5696 | 30.88 | 193 | 1.5400 |
| 0.4861 | 32.0 | 200 | 1.5338 |
| 0.5666 | 32.96 | 206 | 1.5474 |
| 0.5643 | 33.92 | 212 | 1.5519 |
| 0.5643 | 34.88 | 218 | 1.5710 |
| 0.4819 | 36.0 | 225 | 1.5723 |
| 0.5607 | 36.96 | 231 | 1.5749 |
| 0.5609 | 37.92 | 237 | 1.5677 |
| 0.5598 | 38.88 | 243 | 1.5853 |
| 0.4793 | 40.0 | 250 | 1.5951 |
| 0.5587 | 40.96 | 256 | 1.5850 |
| 0.5577 | 41.92 | 262 | 1.5904 |
| 0.5568 | 42.88 | 268 | 1.5913 |
| 0.477 | 44.0 | 275 | 1.5959 |
| 0.5553 | 44.96 | 281 | 1.6042 |
| 0.5556 | 45.92 | 287 | 1.6082 |
| 0.5549 | 46.88 | 293 | 1.6075 |
| 0.4749 | 48.0 | 300 | 1.6079 |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.39.3
  • PyTorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2