train_copa_1745950322

This model is a fine-tuned version of google/gemma-3-1b-it on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1742 (the lowest validation loss observed during training, reached at step 1400; see Training results below)
  • Num Input Tokens Seen: 11200800
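
Since the Framework versions below list PEFT, this checkpoint is a parameter-efficient (PEFT) adapter trained on top of google/gemma-3-1b-it rather than a full standalone model. Below is a minimal usage sketch; the repository id is taken from the card title, the prompt template is an assumption (the exact format used during training is not documented), and loading details may differ in your environment.

```python
# Minimal sketch: attach this PEFT adapter to the base model and run a COPA-style prompt.
# Assumes the adapter lives at rbelanec/train_copa_1745950322 and uses a causal-LM head.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-3-1b-it"
adapter_id = "rbelanec/train_copa_1745950322"  # assumed repo id, from the card title

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# COPA asks which of two alternatives is the more plausible cause/effect of a premise.
# The prompt format below is illustrative only.
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```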

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
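
The exact data preparation is not documented here. If "copa" refers to the COPA task (Choice of Plausible Alternatives) from SuperGLUE, one common way to obtain it is via the datasets library, as in the sketch below; the dataset identifier, split, and preprocessing used for this model are assumptions.

```python
# Illustrative only: load the COPA task from SuperGLUE with the datasets library.
# The dataset id, revision, and preprocessing used for this model are not documented,
# and the hub identifier may differ depending on your datasets version.
from datasets import load_dataset

copa = load_dataset("super_glue", "copa")  # splits: train (400), validation (100), test (500)
example = copa["train"][0]
print(example["premise"], example["question"], example["choice1"], example["choice2"], example["label"])
```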

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
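
For reference, the sketch below shows one way these values could map onto transformers TrainingArguments. Only the values listed above come from this card; the output directory and any other settings (PEFT configuration, evaluation/saving strategy, etc.) are assumptions.

```python
# Rough mapping of the listed hyperparameters onto TrainingArguments.
# Values come from the list above; everything else is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_1745950322",  # assumed
    learning_rate=0.3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,       # total train batch size = 2 * 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```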

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.1585 2.2222 200 0.1960 56176
0.1537 4.4444 400 0.1932 112064
0.1745 6.6667 600 0.1866 168112
0.1615 8.8889 800 0.1828 224048
0.1705 11.1111 1000 0.1820 279904
0.1756 13.3333 1200 0.1769 336032
0.1546 15.5556 1400 0.1742 391904
0.1698 17.7778 1600 0.1744 448112
0.1734 20.0 1800 0.1756 503920
0.1819 22.2222 2000 0.1746 560016
0.1603 24.4444 2200 0.1819 615952
0.177 26.6667 2400 0.1764 672080
0.176 28.8889 2600 0.1817 728000
0.1629 31.1111 2800 0.1786 783872
0.1461 33.3333 3000 0.1881 839808
0.1762 35.5556 3200 0.1912 896064
0.1672 37.7778 3400 0.1763 951888
0.1666 40.0 3600 0.1825 1007760
0.1347 42.2222 3800 0.2474 1063648
0.1005 44.4444 4000 0.2612 1119744
0.0669 46.6667 4200 0.2604 1175680
0.1043 48.8889 4400 0.2414 1231696
0.0445 51.1111 4600 0.3379 1287744
0.0243 53.3333 4800 0.4504 1343760
0.0371 55.5556 5000 0.5648 1399856
0.0138 57.7778 5200 0.5205 1455808
0.0002 60.0 5400 0.7808 1511856
0.0001 62.2222 5600 0.7838 1567808
0.0 64.4444 5800 0.8040 1623744
0.0 66.6667 6000 0.8280 1679888
0.0 68.8889 6200 0.8334 1735952
0.0 71.1111 6400 0.8474 1791904
0.0 73.3333 6600 0.8694 1847888
0.0 75.5556 6800 0.8714 1904000
0.0 77.7778 7000 0.8972 1959920
0.0 80.0 7200 0.9046 2015792
0.0 82.2222 7400 0.9208 2071808
0.0 84.4444 7600 0.9380 2127808
0.0 86.6667 7800 0.9472 2183888
0.0 88.8889 8000 0.9389 2239840
0.0 91.1111 8200 0.9594 2295888
0.0 93.3333 8400 0.9713 2351872
0.0 95.5556 8600 0.9810 2407824
0.0 97.7778 8800 0.9830 2463744
0.0 100.0 9000 1.0046 2519680
0.0 102.2222 9200 1.0120 2575584
0.0 104.4444 9400 1.0307 2631680
0.0 106.6667 9600 1.0327 2687728
0.0 108.8889 9800 1.0424 2743792
0.0 111.1111 10000 1.0433 2799840
0.0 113.3333 10200 1.0506 2855808
0.0 115.5556 10400 1.0754 2911648
0.0 117.7778 10600 1.0816 2967856
0.0 120.0 10800 1.0875 3023792
0.0 122.2222 11000 1.1016 3079920
0.0 124.4444 11200 1.1077 3135904
0.0 126.6667 11400 1.1129 3191808
0.0 128.8889 11600 1.1311 3247840
0.0 131.1111 11800 1.1461 3303712
0.0 133.3333 12000 1.1524 3359680
0.0 135.5556 12200 1.1528 3415824
0.0 137.7778 12400 1.1764 3471520
0.0 140.0 12600 1.1755 3527664
0.0 142.2222 12800 1.2023 3583696
0.0 144.4444 13000 1.1986 3639680
0.0 146.6667 13200 1.2131 3695712
0.0 148.8889 13400 1.2381 3751728
0.0 151.1111 13600 1.2481 3807744
0.0 153.3333 13800 1.2522 3863664
0.0 155.5556 14000 1.2715 3919584
0.0 157.7778 14200 1.2780 3975568
0.0 160.0 14400 1.3001 4031632
0.0 162.2222 14600 1.3049 4087632
0.0 164.4444 14800 1.3115 4143664
0.0 166.6667 15000 1.3477 4199552
0.0 168.8889 15200 1.3329 4255584
0.0 171.1111 15400 1.3379 4311504
0.0 173.3333 15600 1.3553 4367408
0.0 175.5556 15800 1.3785 4423376
0.0 177.7778 16000 1.3628 4479456
0.0 180.0 16200 1.3936 4535504
0.0 182.2222 16400 1.3908 4591504
0.0 184.4444 16600 1.4268 4647424
0.0 186.6667 16800 1.4218 4703376
0.0 188.8889 17000 1.4472 4759552
0.0 191.1111 17200 1.4649 4815552
0.0 193.3333 17400 1.4669 4871600
0.0 195.5556 17600 1.4431 4927696
0.0 197.7778 17800 1.4888 4983424
0.0 200.0 18000 1.5016 5039536
0.0 202.2222 18200 1.4928 5095376
0.0 204.4444 18400 1.5293 5151440
0.0 206.6667 18600 1.5467 5207488
0.0 208.8889 18800 1.5432 5263360
0.0 211.1111 19000 1.5500 5319344
0.0 213.3333 19200 1.5504 5375280
0.0 215.5556 19400 1.5739 5431520
0.0 217.7778 19600 1.5765 5487472
0.0 220.0 19800 1.5911 5543504
0.0 222.2222 20000 1.5940 5599440
0.0 224.4444 20200 1.5977 5655424
0.0 226.6667 20400 1.6347 5711344
0.0 228.8889 20600 1.6275 5767376
0.0 231.1111 20800 1.6913 5823264
0.0 233.3333 21000 1.6944 5879248
0.0 235.5556 21200 1.6750 5935168
0.0 237.7778 21400 1.6816 5991232
0.0 240.0 21600 1.6530 6047376
0.0 242.2222 21800 1.6663 6103328
0.0 244.4444 22000 1.6708 6159376
0.0 246.6667 22200 1.6437 6215360
0.0 248.8889 22400 1.6692 6271232
0.0 251.1111 22600 1.6101 6327136
0.0 253.3333 22800 1.6198 6383248
0.0 255.5556 23000 1.5668 6439168
0.0 257.7778 23200 1.5763 6495280
0.0 260.0 23400 1.5500 6551264
0.0 262.2222 23600 1.5696 6607424
0.0 264.4444 23800 1.5225 6663168
0.0 266.6667 24000 1.5554 6719216
0.0 268.8889 24200 1.5902 6775344
0.0 271.1111 24400 1.4873 6831344
0.0 273.3333 24600 1.5270 6887344
0.0 275.5556 24800 1.6768 6943632
0.0 277.7778 25000 1.6876 6999632
0.0 280.0 25200 1.5999 7055664
0.0 282.2222 25400 1.6702 7111664
0.0 284.4444 25600 1.6623 7167744
0.0 286.6667 25800 1.5950 7223696
0.0 288.8889 26000 1.6427 7279760
0.0 291.1111 26200 1.7028 7335792
0.0 293.3333 26400 1.6055 7391808
0.0 295.5556 26600 1.5844 7447808
0.0 297.7778 26800 1.6416 7503824
0.0 300.0 27000 1.7239 7559856
0.0 302.2222 27200 1.6796 7615904
0.0 304.4444 27400 1.5742 7672000
0.0 306.6667 27600 1.6184 7727808
0.0 308.8889 27800 1.5991 7783744
0.0 311.1111 28000 1.5418 7839808
0.0 313.3333 28200 1.6170 7895872
0.0 315.5556 28400 1.6391 7951664
0.0 317.7778 28600 1.6250 8007744
0.0 320.0 28800 1.6387 8063616
0.0 322.2222 29000 1.6151 8119520
0.0 324.4444 29200 1.6220 8175584
0.0 326.6667 29400 1.6851 8231760
0.0 328.8889 29600 1.6862 8287696
0.0 331.1111 29800 1.6520 8343760
0.0 333.3333 30000 1.6907 8399696
0.0 335.5556 30200 1.6143 8455776
0.0 337.7778 30400 1.6979 8511760
0.0 340.0 30600 1.6834 8567792
0.0 342.2222 30800 1.6586 8623728
0.0 344.4444 31000 1.6678 8679920
0.0 346.6667 31200 1.7132 8736032
0.0 348.8889 31400 1.7332 8791888
0.0 351.1111 31600 1.6019 8847728
0.0 353.3333 31800 1.6742 8903952
0.0 355.5556 32000 1.7011 8959920
0.0 357.7778 32200 1.6816 9016096
0.0 360.0 32400 1.7048 9072192
0.0 362.2222 32600 1.7152 9128272
0.0 364.4444 32800 1.7085 9184240
0.0 366.6667 33000 1.7312 9240064
0.0 368.8889 33200 1.7539 9295952
0.0 371.1111 33400 1.7249 9352016
0.0 373.3333 33600 1.7504 9407968
0.0 375.5556 33800 1.7512 9463920
0.0 377.7778 34000 1.7510 9519984
0.0 380.0 34200 1.7470 9575936
0.0 382.2222 34400 1.7684 9631952
0.0 384.4444 34600 1.7600 9687936
0.0 386.6667 34800 1.7497 9743968
0.0 388.8889 35000 1.7552 9800016
0.0 391.1111 35200 1.7796 9856016
0.0 393.3333 35400 1.7958 9912112
0.0 395.5556 35600 1.7898 9968112
0.0 397.7778 35800 1.7811 10024160
0.0 400.0 36000 1.7936 10080240
0.0 402.2222 36200 1.7834 10136208
0.0 404.4444 36400 1.7830 10192208
0.0 406.6667 36600 1.8050 10248192
0.0 408.8889 36800 1.7914 10304144
0.0 411.1111 37000 1.8239 10360192
0.0 413.3333 37200 1.7780 10416288
0.0 415.5556 37400 1.7846 10472368
0.0 417.7778 37600 1.7938 10528352
0.0 420.0 37800 1.7924 10584384
0.0 422.2222 38000 1.7995 10640496
0.0 424.4444 38200 1.8110 10696528
0.0 426.6667 38400 1.7964 10752640
0.0 428.8889 38600 1.8125 10808672
0.0 431.1111 38800 1.7951 10864512
0.0 433.3333 39000 1.8056 10920608
0.0 435.5556 39200 1.7908 10976624
0.0 437.7778 39400 1.8039 11032608
0.0 440.0 39600 1.7930 11088720
0.0 442.2222 39800 1.7873 11144688
0.0 444.4444 40000 1.7878 11200800

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
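
A quick way to check that a local environment matches these versions (the imports below assume all five packages are installed; PEFT 0.15.2.dev0 is a development build, so it is normally installed from source rather than PyPI):

```python
# Print installed versions and compare against the ones listed above.
import peft, transformers, torch, datasets, tokenizers

print("PEFT:", peft.__version__)                  # expected 0.15.2.dev0
print("Transformers:", transformers.__version__)  # expected 4.51.3
print("PyTorch:", torch.__version__)              # expected 2.6.0+cu124
print("Datasets:", datasets.__version__)          # expected 3.5.0
print("Tokenizers:", tokenizers.__version__)      # expected 0.21.1
```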