train_mrpc_1744902643

This model is a fine-tuned version of google/gemma-3-1b-it on the mrpc dataset. It achieves the following results on the evaluation set:

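  • Loss: 0.1527 (matches the validation loss logged at step 1000 in the training results below; the last logged validation loss, at step 40000, is 1.1949)
  • Num Input Tokens Seen: 68544800 (total at step 40000)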
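
Since the framework versions list PEFT and the model page describes this repository as an adapter, the weights are presumably loaded on top of the base model rather than on their own. The following is a minimal sketch under that assumption. The repository id `rbelanec/train_mrpc_1744902643` is taken from the model page, `load_adapter` is a hypothetical helper name, and the card does not state the adapter type or the prompt format used for mrpc, so the sketch stops at loading.

```python
BASE_MODEL_ID = "google/gemma-3-1b-it"          # base model named in the card
ADAPTER_ID = "rbelanec/train_mrpc_1744902643"   # repository id from the model page


def load_adapter(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper).

    Heavy dependencies are imported lazily so that importing this module
    does not require peft or transformers to be installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # PeftModel.from_pretrained reads the adapter config stored in the
    # repository, so the adapter type does not need to be known here.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_adapter()` downloads both repositories and may require access to the base model to have been granted on the Hub.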

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
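
The relationship between these values can be sketched in a few lines. The block below reproduces the total train batch size from the per-device batch size and the gradient accumulation steps, and evaluates a half-cosine decay of the learning rate over the 40000 training steps. It assumes a single device and zero warmup steps, neither of which is stated in the card, and the function names are illustrative only.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.3
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
TRAINING_STEPS = 40_000


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation * num_devices


def cosine_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS, warmup_steps: int = 0) -> float:
    """Half-cosine decay from base_lr to 0, with optional linear warmup.

    warmup_steps=0 is an assumption: the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)  # clamp so steps past the end stay at 0
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # 4 examples per device x 4 accumulation steps = 16, as listed above.
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 10_000, 20_000, 40_000):
        print(step, cosine_lr(step))
```

Under these assumptions the learning rate starts at 0.3, passes through half that value at the midpoint of training, and reaches 0 at step 40000.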

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.161 0.9685 200 0.1907 342592
0.1822 1.9395 400 0.1594 685504
0.1611 2.9104 600 0.1662 1027680
0.1791 3.8814 800 0.1607 1371040
0.1444 4.8523 1000 0.1527 1713440
0.1477 5.8232 1200 0.1597 2056384
0.1451 6.7942 1400 0.1569 2400544
0.1515 7.7651 1600 0.1735 2741344
0.1643 8.7361 1800 0.1636 3083872
0.1533 9.7070 2000 0.1658 3425696
0.1429 10.6780 2200 0.1588 3769888
0.1556 11.6489 2400 0.1824 4110336
0.1218 12.6199 2600 0.1686 4453600
0.1527 13.5908 2800 0.1947 4796192
0.0872 14.5617 3000 0.1860 5138720
0.1445 15.5327 3200 0.2100 5480512
0.1433 16.5036 3400 0.1877 5822816
0.1014 17.4746 3600 0.2470 6165056
0.1101 18.4455 3800 0.2416 6507264
0.0521 19.4165 4000 0.2813 6849792
0.0438 20.3874 4200 0.2733 7192864
0.1356 21.3584 4400 0.3191 7534272
0.0441 22.3293 4600 0.3034 7877248
0.041 23.3002 4800 0.3460 8220544
0.024 24.2712 5000 0.3483 8562144
0.0235 25.2421 5200 0.3925 8905568
0.0111 26.2131 5400 0.3959 9248640
0.0355 27.1840 5600 0.3593 9592608
0.0166 28.1550 5800 0.3473 9933568
0.0216 29.1259 6000 0.3824 10277088
0.0187 30.0969 6200 0.4093 10619488
0.0332 31.0678 6400 0.3896 10962112
0.0287 32.0387 6600 0.3759 11306080
0.0515 33.0097 6800 0.3481 11649024
0.0241 33.9782 7000 0.4164 11992032
0.023 34.9492 7200 0.3968 12334784
0.0699 35.9201 7400 0.3400 12677888
0.0311 36.8910 7600 0.3645 13020640
0.0095 37.8620 7800 0.4563 13363648
0.0138 38.8329 8000 0.4573 13706752
0.0358 39.8039 8200 0.3834 14048256
0.0293 40.7748 8400 0.3833 14392064
0.0119 41.7458 8600 0.4786 14733504
0.001 42.7167 8800 0.5092 15076736
0.0036 43.6877 9000 0.4884 15418176
0.0258 44.6586 9200 0.5242 15762912
0.0188 45.6295 9400 0.4023 16105760
0.0254 46.6005 9600 0.4033 16448096
0.0109 47.5714 9800 0.4908 16790336
0.0044 48.5424 10000 0.4366 17132896
0.0034 49.5133 10200 0.4603 17477376
0.007 50.4843 10400 0.5221 17817792
0.003 51.4552 10600 0.5243 18160384
0.0013 52.4262 10800 0.6114 18502784
0.0128 53.3971 11000 0.5867 18845184
0.0076 54.3680 11200 0.5882 19187296
0.042 55.3390 11400 0.4477 19529792
0.0147 56.3099 11600 0.5452 19873728
0.0166 57.2809 11800 0.5037 20215680
0.0022 58.2518 12000 0.5719 20558624
0.0004 59.2228 12200 0.5728 20901984
0.0018 60.1937 12400 0.6163 21244800
0.0004 61.1646 12600 0.6409 21588704
0.0001 62.1356 12800 0.6553 21931872
0.0001 63.1065 13000 0.6692 22274560
0.0 64.0775 13200 0.6789 22618432
0.0 65.0484 13400 0.6877 22961216
0.0 66.0194 13600 0.6961 23304288
0.0 66.9879 13800 0.7061 23646592
0.0 67.9588 14000 0.7122 23989408
0.0 68.9298 14200 0.7210 24332544
0.0 69.9007 14400 0.7238 24675424
0.0 70.8717 14600 0.7328 25017632
0.0 71.8426 14800 0.7408 25360352
0.0 72.8136 15000 0.7473 25701344
0.0 73.7845 15200 0.7552 26046016
0.0 74.7554 15400 0.7630 26388448
0.0 75.7264 15600 0.7694 26729856
0.0 76.6973 15800 0.7714 27072064
0.0 77.6683 16000 0.7861 27415968
0.0 78.6392 16200 0.7885 27759520
0.0 79.6102 16400 0.7974 28101632
0.0 80.5811 16600 0.8027 28446208
0.0 81.5521 16800 0.8075 28787840
0.0 82.5230 17000 0.8158 29129536
0.0 83.4939 17200 0.8239 29473344
0.0 84.4649 17400 0.8303 29815360
0.0 85.4358 17600 0.8376 30157632
0.0 86.4068 17800 0.8439 30501440
0.0 87.3777 18000 0.8497 30843072
0.0 88.3487 18200 0.8595 31187360
0.0 89.3196 18400 0.8655 31528480
0.0 90.2906 18600 0.8731 31872544
0.0 91.2615 18800 0.8824 32214560
0.0 92.2324 19000 0.8885 32558112
0.0 93.2034 19200 0.8940 32900448
0.0 94.1743 19400 0.9026 33244800
0.0 95.1453 19600 0.9150 33587168
0.0 96.1162 19800 0.9225 33929248
0.0 97.0872 20000 0.9254 34271648
0.0 98.0581 20200 0.9303 34613344
0.0 99.0291 20400 0.9378 34957056
0.0 99.9976 20600 0.9486 35299200
0.0 100.9685 20800 0.9553 35642464
0.0 101.9395 21000 0.9646 35985280
0.0 102.9104 21200 0.9644 36327840
0.0 103.8814 21400 0.9756 36669664
0.0 104.8523 21600 0.9847 37012960
0.0 105.8232 21800 0.9931 37355968
0.0 106.7942 22000 1.0025 37698112
0.0 107.7651 22200 1.0076 38040768
0.0 108.7361 22400 1.0138 38383744
0.0 109.7070 22600 1.0152 38726880
0.0 110.6780 22800 1.0302 39068512
0.0 111.6489 23000 1.0302 39411712
0.0 112.6199 23200 1.0323 39754784
0.0 113.5908 23400 1.0441 40097568
0.0 114.5617 23600 1.0523 40441152
0.0 115.5327 23800 1.0557 40784672
0.0 116.5036 24000 1.0628 41127232
0.0 117.4746 24200 1.0709 41468768
0.0 118.4455 24400 1.0707 41811328
0.0 119.4165 24600 1.0784 42154688
0.0 120.3874 24800 1.0863 42497024
0.0 121.3584 25000 1.0897 42838112
0.0 122.3293 25200 1.0971 43181600
0.0 123.3002 25400 1.0959 43524256
0.0 124.2712 25600 1.0986 43867840
0.0 125.2421 25800 1.1098 44207680
0.0 126.2131 26000 1.1184 44551232
0.0 127.1840 26200 1.1156 44894816
0.0 128.1550 26400 1.1208 45236928
0.0 129.1259 26600 1.1199 45579584
0.0 130.0969 26800 1.1278 45923328
0.0 131.0678 27000 1.1318 46264032
0.0 132.0387 27200 1.1411 46607776
0.0 133.0097 27400 1.1394 46950752
0.0 133.9782 27600 1.1429 47293824
0.0 134.9492 27800 1.1487 47637248
0.0 135.9201 28000 1.1461 47979552
0.0 136.8910 28200 1.1441 48322528
0.0 137.8620 28400 1.1561 48663488
0.0 138.8329 28600 1.1517 49008000
0.0 139.8039 28800 1.1507 49350304
0.0 140.7748 29000 1.1619 49694528
0.0 141.7458 29200 1.1625 50035616
0.0 142.7167 29400 1.1646 50378912
0.0 143.6877 29600 1.1668 50722400
0.0 144.6586 29800 1.1662 51064768
0.0 145.6295 30000 1.1683 51407840
0.0 146.6005 30200 1.1743 51749792
0.0 147.5714 30400 1.1748 52094304
0.0 148.5424 30600 1.1746 52436000
0.0 149.5133 30800 1.1759 52777984
0.0 150.4843 31000 1.1771 53119904
0.0 151.4552 31200 1.1726 53462560
0.0 152.4262 31400 1.1877 53806272
0.0 153.3971 31600 1.1818 54148640
0.0 154.3680 31800 1.1787 54489984
0.0 155.3390 32000 1.1769 54832032
0.0 156.3099 32200 1.1839 55173664
0.0 157.2809 32400 1.1904 55517376
0.0 158.2518 32600 1.1881 55861088
0.0 159.2228 32800 1.1936 56203392
0.0 160.1937 33000 1.1827 56545632
0.0 161.1646 33200 1.1905 56888352
0.0 162.1356 33400 1.1879 57231584
0.0 163.1065 33600 1.1913 57574112
0.0 164.0775 33800 1.1953 57917728
0.0 165.0484 34000 1.1933 58261184
0.0 166.0194 34200 1.1970 58604352
0.0 166.9879 34400 1.1914 58946112
0.0 167.9588 34600 1.1899 59289344
0.0 168.9298 34800 1.1923 59631584
0.0 169.9007 35000 1.1975 59974880
0.0 170.8717 35200 1.1925 60318560
0.0 171.8426 35400 1.1931 60662016
0.0 172.8136 35600 1.1948 61004352
0.0 173.7845 35800 1.1955 61347296
0.0 174.7554 36000 1.1908 61689824
0.0 175.7264 36200 1.1990 62033792
0.0 176.6973 36400 1.1992 62376224
0.0 177.6683 36600 1.1940 62720096
0.0 178.6392 36800 1.1968 63062656
0.0 179.6102 37000 1.1963 63405504
0.0 180.5811 37200 1.1959 63748768
0.0 181.5521 37400 1.1894 64092416
0.0 182.5230 37600 1.1997 64436992
0.0 183.4939 37800 1.1968 64777984
0.0 184.4649 38000 1.1958 65120224
0.0 185.4358 38200 1.1940 65462240
0.0 186.4068 38400 1.1946 65805504
0.0 187.3777 38600 1.1984 66148448
0.0 188.3487 38800 1.1946 66490240
0.0 189.3196 39000 1.1953 66832256
0.0 190.2906 39200 1.1945 67174336
0.0 191.2615 39400 1.2012 67517920
0.0 192.2324 39600 1.2013 67860384
0.0 193.2034 39800 1.1949 68203104
0.0 194.1743 40000 1.1949 68544800
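
In the table above, the training loss falls to 0.0 while the validation loss rises from 0.1527 at step 1000 to 1.1949 at step 40000, a pattern usually read as overfitting. As a rough illustration of how such a log could be inspected, the sketch below takes a handful of rows copied from the table, picks the row with the lowest validation loss, and computes the gap between validation and training loss. The names `Row`, `best_checkpoint`, `generalization_gap`, and `tokens_per_step` are hypothetical, and only an excerpt of the table is included.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    input_tokens_seen: int


# Excerpt of the training results table (values copied verbatim).
ROWS: List[Row] = [
    Row(0.161, 0.9685, 200, 0.1907, 342592),
    Row(0.1444, 4.8523, 1000, 0.1527, 1713440),
    Row(0.0872, 14.5617, 3000, 0.1860, 5138720),
    Row(0.0521, 19.4165, 4000, 0.2813, 6849792),
    Row(0.0044, 48.5424, 10000, 0.4366, 17132896),
    Row(0.0, 97.0872, 20000, 0.9254, 34271648),
    Row(0.0, 194.1743, 40000, 1.1949, 68544800),
]


def best_checkpoint(rows: List[Row]) -> Row:
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row.validation_loss)


def generalization_gap(row: Row) -> float:
    """Validation loss minus training loss at one logging step."""
    return row.validation_loss - row.training_loss


def tokens_per_step(row: Row) -> float:
    """Average input tokens consumed per optimizer step up to this row."""
    return row.input_tokens_seen / row.step


if __name__ == "__main__":
    best = best_checkpoint(ROWS)
    print("lowest validation loss:", best.validation_loss, "at step", best.step)
    for row in ROWS:
        print(row.step, generalization_gap(row), tokens_per_step(row))
```

On this excerpt the lowest validation loss is the step 1000 row, which is also the loss reported at the top of the card.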

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
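
To reproduce this environment, it may help to compare the installed packages against the versions listed above. The following is a minimal sketch using only the standard library. It assumes that "Pytorch" corresponds to the `torch` distribution name on PyPI, and `check_versions` is a hypothetical helper, not part of any of the listed libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; "Pytorch" is assumed to map to "torch".
PINNED = {
    "peft": "0.15.1",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}


def check_versions(pinned=PINNED):
    """Map each package to (wanted, installed, matches).

    installed is None when the package is not present in the environment.
    """
    report = {}
    for package, wanted in pinned.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, matches) in check_versions().items():
        status = "ok" if matches else "differs"
        print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the adapter will fail to load; the check only flags where the environment differs from the one recorded in the card.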

Model tree for rbelanec/train_mrpc_1744902643

This model is an adapter for google/gemma-3-1b-it.