train_copa_1745950323

This model is a fine-tuned version of google/gemma-3-1b-it on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1725
  • Num Input Tokens Seen: 11200800
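
Since usage is not yet documented in this card, here is a minimal inference sketch. It assumes this repository hosts a PEFT adapter for google/gemma-3-1b-it under the id rbelanec/train_copa_1745950323 (as shown on this page), that access to the gated base model has been granted, and that the COPA-style prompt below is only illustrative, since the actual training prompt template is not documented here.

```python
# Minimal sketch, assuming the adapter repo id rbelanec/train_copa_1745950323
# and access to the gated google/gemma-3-1b-it base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-3-1b-it"
adapter_id = "rbelanec/train_copa_1745950323"  # adapter repo shown on this page

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter

# COPA-style prompt: pick the more plausible cause/effect. The exact format used
# during training is an assumption, since it is not documented in this card.
prompt = (
    "Premise: The man broke his toe.\n"
    "Question: What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```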

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
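
For reference, a hedged sketch of how these values might map onto transformers.TrainingArguments. The actual training script is not included in this card, so the output directory name, evaluation cadence, and logging cadence are assumptions (the 200-step eval interval is inferred from the results table below).

```python
# Hedged sketch of the listed hyperparameters as transformers.TrainingArguments;
# the surrounding Trainer setup (model, datasets, collator) is assumed, not documented here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_1745950323",   # assumed output directory name
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,        # effective train batch size: 2 * 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
    eval_strategy="steps",                # inferred: the table logs eval every 200 steps
    eval_steps=200,
    logging_steps=200,
)
```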

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.1727 2.2222 200 0.1756 56176
0.1686 4.4444 400 0.1725 112064
0.061 6.6667 600 0.2359 168112
0.0765 8.8889 800 0.3596 224048
0.0003 11.1111 1000 0.4669 279904
0.0 13.3333 1200 0.6947 336032
0.0 15.5556 1400 0.7523 391904
0.0 17.7778 1600 0.7925 448112
0.0 20.0 1800 0.8222 503920
0.0 22.2222 2000 0.8503 560016
0.0 24.4444 2200 0.8652 615952
0.0 26.6667 2400 0.8762 672080
0.0 28.8889 2600 0.8936 728000
0.0 31.1111 2800 0.9060 783872
0.0 33.3333 3000 0.9416 839808
0.0 35.5556 3200 0.9402 896064
0.0 37.7778 3400 0.9591 951888
0.0 40.0 3600 0.9698 1007760
0.0 42.2222 3800 0.9947 1063648
0.0 44.4444 4000 0.9982 1119744
0.0 46.6667 4200 1.0181 1175680
0.0 48.8889 4400 1.0073 1231696
0.0 51.1111 4600 1.0176 1287744
0.0 53.3333 4800 1.0251 1343760
0.0 55.5556 5000 1.0433 1399856
0.0 57.7778 5200 1.0433 1455808
0.0 60.0 5400 1.0580 1511856
0.0 62.2222 5600 1.0587 1567808
0.0 64.4444 5800 1.0723 1623744
0.0 66.6667 6000 1.0874 1679888
0.0 68.8889 6200 1.0618 1735952
0.0 71.1111 6400 1.0960 1791904
0.0 73.3333 6600 1.1207 1847888
0.0 75.5556 6800 1.1112 1904000
0.0 77.7778 7000 1.1231 1959920
0.0 80.0 7200 1.1284 2015792
0.0 82.2222 7400 1.1410 2071808
0.0 84.4444 7600 1.1386 2127808
0.0 86.6667 7800 1.1319 2183888
0.0 88.8889 8000 1.1530 2239840
0.0 91.1111 8200 1.1677 2295888
0.0 93.3333 8400 1.1811 2351872
0.0 95.5556 8600 1.1916 2407824
0.0 97.7778 8800 1.1786 2463744
0.0 100.0 9000 1.2107 2519680
0.0 102.2222 9200 1.1827 2575584
0.0 104.4444 9400 1.1796 2631680
0.0 106.6667 9600 1.2282 2687728
0.0 108.8889 9800 1.2008 2743792
0.0 111.1111 10000 1.1935 2799840
0.0 113.3333 10200 1.1934 2855808
0.0 115.5556 10400 1.2338 2911648
0.0 117.7778 10600 1.2305 2967856
0.0 120.0 10800 1.1921 3023792
0.0 122.2222 11000 1.2243 3079920
0.0 124.4444 11200 1.2163 3135904
0.0 126.6667 11400 1.2182 3191808
0.0 128.8889 11600 1.2363 3247840
0.0 131.1111 11800 1.2182 3303712
0.0 133.3333 12000 1.2518 3359680
0.0 135.5556 12200 1.1946 3415824
0.0 137.7778 12400 1.2288 3471520
0.0 140.0 12600 1.2415 3527664
0.0 142.2222 12800 1.1985 3583696
0.0 144.4444 13000 1.2051 3639680
0.0 146.6667 13200 1.2130 3695712
0.0 148.8889 13400 1.1997 3751728
0.0 151.1111 13600 1.2459 3807744
0.0 153.3333 13800 1.2529 3863664
0.0 155.5556 14000 1.2405 3919584
0.0 157.7778 14200 1.2380 3975568
0.0 160.0 14400 1.2566 4031632
0.0 162.2222 14600 1.2693 4087632
0.0 164.4444 14800 1.2770 4143664
0.0 166.6667 15000 1.2715 4199552
0.0 168.8889 15200 1.2732 4255584
0.0 171.1111 15400 1.2954 4311504
0.0 173.3333 15600 1.2588 4367408
0.0 175.5556 15800 1.2670 4423376
0.0 177.7778 16000 1.2695 4479456
0.0 180.0 16200 1.3130 4535504
0.0 182.2222 16400 1.2977 4591504
0.0 184.4444 16600 1.2993 4647424
0.0 186.6667 16800 1.3115 4703376
0.0 188.8889 17000 1.3025 4759552
0.0 191.1111 17200 1.2985 4815552
0.0 193.3333 17400 1.3269 4871600
0.0 195.5556 17600 1.2977 4927696
0.0 197.7778 17800 1.3144 4983424
0.0 200.0 18000 1.2850 5039536
0.0 202.2222 18200 1.3128 5095376
0.0 204.4444 18400 1.3168 5151440
0.0 206.6667 18600 1.3172 5207488
0.0 208.8889 18800 1.2778 5263360
0.0 211.1111 19000 1.3527 5319344
0.0 213.3333 19200 1.3433 5375280
0.0 215.5556 19400 1.2959 5431520
0.0 217.7778 19600 1.3496 5487472
0.0 220.0 19800 1.3567 5543504
0.0 222.2222 20000 1.3599 5599440
0.0 224.4444 20200 1.4162 5655424
0.0 226.6667 20400 1.4206 5711344
0.0 228.8889 20600 1.3969 5767376
0.0 231.1111 20800 1.4180 5823264
0.0 233.3333 21000 1.4176 5879248
0.0 235.5556 21200 1.4442 5935168
0.0 237.7778 21400 1.4511 5991232
0.0 240.0 21600 1.4869 6047376
0.0 242.2222 21800 1.5168 6103328
0.0 244.4444 22000 1.4851 6159376
0.0 246.6667 22200 1.5620 6215360
0.0 248.8889 22400 1.5264 6271232
0.0 251.1111 22600 1.6044 6327136
0.0 253.3333 22800 1.5396 6383248
0.0 255.5556 23000 1.5610 6439168
0.0 257.7778 23200 1.5009 6495280
0.0 260.0 23400 1.5016 6551264
0.0 262.2222 23600 1.5534 6607424
0.0 264.4444 23800 1.5226 6663168
0.0 266.6667 24000 1.6125 6719216
0.0 268.8889 24200 1.5949 6775344
0.0 271.1111 24400 1.6079 6831344
0.0 273.3333 24600 1.6199 6887344
0.0 275.5556 24800 1.5673 6943632
0.0 277.7778 25000 1.5777 6999632
0.0 280.0 25200 1.6121 7055664
0.0 282.2222 25400 1.6366 7111664
0.0 284.4444 25600 1.6301 7167744
0.0 286.6667 25800 1.5832 7223696
0.0 288.8889 26000 1.5041 7279760
0.0 291.1111 26200 1.5703 7335792
0.0 293.3333 26400 1.5177 7391808
0.0 295.5556 26600 1.5250 7447808
0.0 297.7778 26800 1.4903 7503824
0.0001 300.0 27000 0.8739 7559856
0.0 302.2222 27200 0.6719 7615904
0.0 304.4444 27400 0.7428 7672000
0.0 306.6667 27600 0.7946 7727808
0.0 308.8889 27800 0.8179 7783744
0.0 311.1111 28000 0.8506 7839808
0.0 313.3333 28200 0.8725 7895872
0.0 315.5556 28400 0.8748 7951664
0.0 317.7778 28600 0.9427 8007744
0.0 320.0 28800 0.9171 8063616
0.0 322.2222 29000 0.9534 8119520
0.0 324.4444 29200 0.9511 8175584
0.0 326.6667 29400 0.9984 8231760
0.0 328.8889 29600 0.9725 8287696
0.0 331.1111 29800 1.0153 8343760
0.0 333.3333 30000 1.0551 8399696
0.0 335.5556 30200 1.0328 8455776
0.0 337.7778 30400 1.0402 8511760
0.0 340.0 30600 1.0618 8567792
0.0 342.2222 30800 1.0694 8623728
0.0 344.4444 31000 1.0588 8679920
0.0 346.6667 31200 1.1039 8736032
0.0 348.8889 31400 1.0881 8791888
0.0 351.1111 31600 1.1246 8847728
0.0 353.3333 31800 1.0899 8903952
0.0 355.5556 32000 1.1148 8959920
0.0 357.7778 32200 1.1080 9016096
0.0 360.0 32400 1.1077 9072192
0.0 362.2222 32600 1.1190 9128272
0.0 364.4444 32800 1.1359 9184240
0.0 366.6667 33000 1.1823 9240064
0.0 368.8889 33200 1.1525 9295952
0.0 371.1111 33400 1.1642 9352016
0.0 373.3333 33600 1.1804 9407968
0.0 375.5556 33800 1.1850 9463920
0.0 377.7778 34000 1.1752 9519984
0.0 380.0 34200 1.2011 9575936
0.0 382.2222 34400 1.1528 9631952
0.0 384.4444 34600 1.1656 9687936
0.0 386.6667 34800 1.1614 9743968
0.0 388.8889 35000 1.1943 9800016
0.0 391.1111 35200 1.2018 9856016
0.0 393.3333 35400 1.2036 9912112
0.0 395.5556 35600 1.1697 9968112
0.0 397.7778 35800 1.2243 10024160
0.0 400.0 36000 1.1905 10080240
0.0 402.2222 36200 1.2014 10136208
0.0 404.4444 36400 1.2127 10192208
0.0 406.6667 36600 1.2262 10248192
0.0 408.8889 36800 1.2096 10304144
0.0 411.1111 37000 1.2468 10360192
0.0 413.3333 37200 1.2494 10416288
0.0 415.5556 37400 1.2069 10472368
0.0 417.7778 37600 1.2231 10528352
0.0 420.0 37800 1.2687 10584384
0.0 422.2222 38000 1.2151 10640496
0.0 424.4444 38200 1.2405 10696528
0.0 426.6667 38400 1.2780 10752640
0.0 428.8889 38600 1.2369 10808672
0.0 431.1111 38800 1.2551 10864512
0.0 433.3333 39000 1.2632 10920608
0.0 435.5556 39200 1.2149 10976624
0.0 437.7778 39400 1.2244 11032608
0.0 440.0 39600 1.2335 11088720
0.0 442.2222 39800 1.2725 11144688
0.0 444.4444 40000 1.2688 11200800

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
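
If reproducing this environment, the following sketch checks installed versions against those listed above. Package names are assumed to match their PyPI names, and the PEFT dev build may need to be installed from source rather than from PyPI.

```python
# Hedged environment check against the framework versions listed above.
from importlib.metadata import version

expected = {
    "peft": "0.15.2.dev0",      # dev build; may require an install from source
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}
for package, wanted in expected.items():
    installed = version(package)
    status = "OK" if installed == wanted else f"differs (expected {wanted})"
    print(f"{package}: {installed} {status}")
```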