train_mrpc_1744902647

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1153
  • Num Input Tokens Seen: 65784064

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 0.3
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
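
A minimal sketch of these settings expressed as a transformers TrainingArguments object, assuming the standard Trainer API. Only the values listed above come from this card; the output directory is an assumed placeholder, and the PEFT/LoRA adapter configuration is not documented here, so it is omitted.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mrpc_1744902647",  # assumed placeholder, not stated in the card
    learning_rate=0.3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,  # per-device batch 4 x accumulation 4 = total batch 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```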

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.178 0.9685 200 0.2087 329312
0.1823 1.9395 400 0.1658 658560
0.1749 2.9104 600 0.1568 987040
0.1421 3.8814 800 0.1488 1316448
0.1589 4.8523 1000 0.1386 1644608
0.1413 5.8232 1200 0.1437 1974016
0.1574 6.7942 1400 0.1415 2303584
0.1059 7.7651 1600 0.1394 2630688
0.1251 8.7361 1800 0.1364 2959808
0.1088 9.7070 2000 0.1153 3287584
0.0996 10.6780 2200 0.1288 3617920
0.0731 11.6489 2400 0.1425 3945536
0.0462 12.6199 2600 0.1264 4274560
0.0608 13.5908 2800 0.1430 4603168
0.022 14.5617 3000 0.1643 4932448
0.0288 15.5327 3200 0.1873 5261312
0.0596 16.5036 3400 0.1731 5589632
0.0199 17.4746 3600 0.2195 5918112
0.0126 18.4455 3800 0.1945 6246368
0.0278 19.4165 4000 0.2397 6574848
0.0266 20.3874 4200 0.2409 6903520
0.005 21.3584 4400 0.2498 7231904
0.0084 22.3293 4600 0.2794 7561504
0.0394 23.3002 4800 0.3202 7890912
0.0073 24.2712 5000 0.2466 8218592
0.0014 25.2421 5200 0.2287 8548256
0.0264 26.2131 5400 0.2157 8876704
0.0123 27.1840 5600 0.2764 9206272
0.0033 28.1550 5800 0.2357 9534720
0.0057 29.1259 6000 0.2608 9864384
0.0052 30.0969 6200 0.2388 10193376
0.0011 31.0678 6400 0.3327 10521952
0.0081 32.0387 6600 0.3200 10851520
0.0102 33.0097 6800 0.2724 11180544
0.0056 33.9782 7000 0.3153 11509344
0.0106 34.9492 7200 0.3401 11838208
0.0221 35.9201 7400 0.2370 12167872
0.0417 36.8910 7600 0.2576 12496352
0.017 37.8620 7800 0.2671 12826048
0.0308 38.8329 8000 0.2404 13155040
0.0004 39.8039 8200 0.3460 13483008
0.0026 40.7748 8400 0.3096 13812064
0.0096 41.7458 8600 0.2907 14140576
0.0053 42.7167 8800 0.3575 14469248
0.0004 43.6877 9000 0.3422 14796672
0.0004 44.6586 9200 0.3874 15126752
0.0001 45.6295 9400 0.4214 15456160
0.0 46.6005 9600 0.4649 15784928
0.0 47.5714 9800 0.4702 16113248
0.0 48.5424 10000 0.4790 16442496
0.0 49.5133 10200 0.4879 16772640
0.0 50.4843 10400 0.4982 17100000
0.0 51.4552 10600 0.5047 17428768
0.0 52.4262 10800 0.5145 17757344
0.0 53.3971 11000 0.5212 18085920
0.0 54.3680 11200 0.5300 18414336
0.0 55.3390 11400 0.5361 18743040
0.0 56.3099 11600 0.5398 19072928
0.0 57.2809 11800 0.5495 19401376
0.0 58.2518 12000 0.5575 19730336
0.0 59.2228 12200 0.5617 20059488
0.0 60.1937 12400 0.5684 20388064
0.0 61.1646 12600 0.5766 20718144
0.0 62.1356 12800 0.5837 21048224
0.0 63.1065 13000 0.5898 21376576
0.0 64.0775 13200 0.5944 21706080
0.0 65.0484 13400 0.6017 22034624
0.0 66.0194 13600 0.6081 22364128
0.0 66.9879 13800 0.6153 22692352
0.0 67.9588 14000 0.6185 23020864
0.0 68.9298 14200 0.6256 23349920
0.0 69.9007 14400 0.6316 23679072
0.0 70.8717 14600 0.6375 24007776
0.0 71.8426 14800 0.6423 24336640
0.0 72.8136 15000 0.6482 24664576
0.0 73.7845 15200 0.6532 24994848
0.0 74.7554 15400 0.6600 25322720
0.0 75.7264 15600 0.6636 25650784
0.0 76.6973 15800 0.6717 25980512
0.0 77.6683 16000 0.6783 26309536
0.0 78.6392 16200 0.6823 26638944
0.0 79.6102 16400 0.6852 26967360
0.0 80.5811 16600 0.6886 27297120
0.0 81.5521 16800 0.6960 27626144
0.0 82.5230 17000 0.6995 27954656
0.0 83.4939 17200 0.7026 28284160
0.0 84.4649 17400 0.7082 28612224
0.0 85.4358 17600 0.7175 28940448
0.0 86.4068 17800 0.7151 29270912
0.0 87.3777 18000 0.7190 29599424
0.0 88.3487 18200 0.7216 29929280
0.0 89.3196 18400 0.7255 30257504
0.0 90.2906 18600 0.7287 30586944
0.0 91.2615 18800 0.7326 30915744
0.0 92.2324 19000 0.7370 31245216
0.0 93.2034 19200 0.7390 31573600
0.0 94.1743 19400 0.7438 31903616
0.0 95.1453 19600 0.7448 32232032
0.0 96.1162 19800 0.7473 32560480
0.0 97.0872 20000 0.7519 32889696
0.0 98.0581 20200 0.7500 33218016
0.0 99.0291 20400 0.7546 33547296
0.0 99.9976 20600 0.7566 33876000
0.0 100.9685 20800 0.7594 34205376
0.0 101.9395 21000 0.7605 34534496
0.0 102.9104 21200 0.7605 34864000
0.0 103.8814 21400 0.7587 35192256
0.0 104.8523 21600 0.7634 35521376
0.0 105.8232 21800 0.7686 35851264
0.0 106.7942 22000 0.7694 36180000
0.0 107.7651 22200 0.7666 36508832
0.0 108.7361 22400 0.7679 36837600
0.0 109.7070 22600 0.7702 37166720
0.0 110.6780 22800 0.7690 37495520
0.0 111.6489 23000 0.7686 37824352
0.0 112.6199 23200 0.7735 38153856
0.0 113.5908 23400 0.7741 38483200
0.0 114.5617 23600 0.7726 38812672
0.0 115.5327 23800 0.7704 39142400
0.0 116.5036 24000 0.7778 39471200
0.0 117.4746 24200 0.7778 39798848
0.0 118.4455 24400 0.7782 40127360
0.0 119.4165 24600 0.7768 40456736
0.0 120.3874 24800 0.7763 40785312
0.0 121.3584 25000 0.7760 41112576
0.0 122.3293 25200 0.7755 41442112
0.0 123.3002 25400 0.7797 41771552
0.0 124.2712 25600 0.7775 42101248
0.0 125.2421 25800 0.7784 42427392
0.0 126.2131 26000 0.7776 42756704
0.0 127.1840 26200 0.7778 43085664
0.0 128.1550 26400 0.7804 43414240
0.0 129.1259 26600 0.7818 43743072
0.0 130.0969 26800 0.7803 44072768
0.0 131.0678 27000 0.7818 44400192
0.0 132.0387 27200 0.7777 44729632
0.0 133.0097 27400 0.7763 45058976
0.0 133.9782 27600 0.7758 45388352
0.0 134.9492 27800 0.7759 45717952
0.0 135.9201 28000 0.7753 46046144
0.0 136.8910 28200 0.7770 46375168
0.0 137.8620 28400 0.7772 46702816
0.0 138.8329 28600 0.7731 47033152
0.0 139.8039 28800 0.7766 47361472
0.0 140.7748 29000 0.7751 47691424
0.0 141.7458 29200 0.7738 48019712
0.0 142.7167 29400 0.7735 48348832
0.0 143.6877 29600 0.7778 48678560
0.0 144.6586 29800 0.7774 49008256
0.0 145.6295 30000 0.7776 49337088
0.0 146.6005 30200 0.7748 49665344
0.0 147.5714 30400 0.7795 49996128
0.0 148.5424 30600 0.7759 50324736
0.0 149.5133 30800 0.7778 50652864
0.0 150.4843 31000 0.7747 50981920
0.0 151.4552 31200 0.7766 51310752
0.0 152.4262 31400 0.7740 51640352
0.0 153.3971 31600 0.7767 51969184
0.0 154.3680 31800 0.7749 52297280
0.0 155.3390 32000 0.7770 52625600
0.0 156.3099 32200 0.7763 52953920
0.0 157.2809 32400 0.7756 53283648
0.0 158.2518 32600 0.7734 53613056
0.0 159.2228 32800 0.7745 53941632
0.0 160.1937 33000 0.7744 54270272
0.0 161.1646 33200 0.7751 54599104
0.0 162.1356 33400 0.7762 54929056
0.0 163.1065 33600 0.7753 55257728
0.0 164.0775 33800 0.7750 55587456
0.0 165.0484 34000 0.7761 55916576
0.0 166.0194 34200 0.7766 56245664
0.0 166.9879 34400 0.7779 56574272
0.0 167.9588 34600 0.7757 56903360
0.0 168.9298 34800 0.7782 57232032
0.0 169.9007 35000 0.7748 57561504
0.0 170.8717 35200 0.7741 57891168
0.0 171.8426 35400 0.7750 58220352
0.0 172.8136 35600 0.7764 58548960
0.0 173.7845 35800 0.7737 58878688
0.0 174.7554 36000 0.7754 59207104
0.0 175.7264 36200 0.7763 59536800
0.0 176.6973 36400 0.7769 59865312
0.0 177.6683 36600 0.7765 60194816
0.0 178.6392 36800 0.7797 60523584
0.0 179.6102 37000 0.7767 60852352
0.0 180.5811 37200 0.7763 61181024
0.0 181.5521 37400 0.7752 61510624
0.0 182.5230 37600 0.7787 61840672
0.0 183.4939 37800 0.7763 62167808
0.0 184.4649 38000 0.7755 62496960
0.0 185.4358 38200 0.7765 62826016
0.0 186.4068 38400 0.7740 63154784
0.0 187.3777 38600 0.7765 63483904
0.0 188.3487 38800 0.7759 63811808
0.0 189.3196 39000 0.7742 64139488
0.0 190.2906 39200 0.7752 64467808
0.0 191.2615 39400 0.7753 64798112
0.0 192.2324 39600 0.7759 65126304
0.0 193.2034 39800 0.7780 65455776
0.0 194.1743 40000 0.7753 65784064

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
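
Given the PEFT and Transformers versions above, the sketch below shows one way to load this adapter on top of the base model, assuming the adapter is published on the Hugging Face Hub as rbelanec/train_mrpc_1744902647 and that you have access to the gated base model. The prompt format is only an illustration; the card does not document how MRPC sentence pairs were formatted during training.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_mrpc_1744902647"  # adapter repo id from this card

# Load the base model, then attach the fine-tuned PEFT adapter.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

# Illustrative MRPC-style paraphrase prompt (format is an assumption).
prompt = (
    "Are the following two sentences paraphrases of each other? Answer yes or no.\n"
    "Sentence 1: The company said revenue rose 5% in the quarter.\n"
    "Sentence 2: Quarterly revenue increased by 5%, the company said.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```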