train_copa_1745950321

This model is a fine-tuned version of google/gemma-3-1b-it on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1723
  • Num Input Tokens Seen: 11200800

Model description

More information needed

Intended uses & limitations

More information needed
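
Although the card is still a stub, the repository holds a PEFT adapter rather than full model weights (see Framework versions below). A minimal loading sketch, assuming the adapter repository id rbelanec/train_copa_1745950321 and a generic COPA-style prompt (the prompt template actually used during training is not documented here):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and attach this fine-tuned adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")
model = PeftModel.from_pretrained(base_model, "rbelanec/train_copa_1745950321")
model.eval()

# Illustrative COPA-style prompt; the real training prompt format may differ.
prompt = (
    "The man broke his toe. What was the cause?\n"
    "Option 1: He got a hole in his sock.\n"
    "Option 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```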

Training and evaluation data

More information needed
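
The exact data preparation is not documented. For reference, COPA (Choice of Plausible Alternatives) ships as a SuperGLUE task; a minimal loading sketch, assuming the standard super_glue/copa configuration on the Hugging Face Hub (both the dataset id and any preprocessing are assumptions):

```python
from datasets import load_dataset

# COPA from SuperGLUE; the dataset id and the preprocessing used for this
# run are assumptions, since the card does not document them.
copa = load_dataset("super_glue", "copa")
print(copa)              # train / validation / test splits
print(copa["train"][0])  # premise, choice1, choice2, question, idx, label
```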

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
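
The training script itself is not part of this card; the block below is only a hedged reconstruction of how the values above map onto transformers TrainingArguments (output_dir is a placeholder):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="train_copa_1745950321",  # placeholder, not documented
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,       # with train_batch_size 2, gives total_train_batch_size 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```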

Training results

Validation loss reaches its minimum of 0.1723 at step 2600 (the evaluation loss reported above) and then rises for the remainder of the 40,000-step run, so later checkpoints increasingly overfit the training data.

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.5214 2.2222 200 0.5374 56176
0.4327 4.4444 400 0.3699 112064
0.1761 6.6667 600 0.2553 168112
0.1897 8.8889 800 0.2139 224048
0.1566 11.1111 1000 0.2041 279904
0.1435 13.3333 1200 0.1869 336032
0.1579 15.5556 1400 0.1808 391904
0.1658 17.7778 1600 0.1790 448112
0.1728 20.0 1800 0.1773 503920
0.1878 22.2222 2000 0.1756 560016
0.1598 24.4444 2200 0.1741 615952
0.1208 26.6667 2400 0.1765 672080
0.1784 28.8889 2600 0.1723 728000
0.0887 31.1111 2800 0.1766 783872
0.1471 33.3333 3000 0.1777 839808
0.1169 35.5556 3200 0.1777 896064
0.1733 37.7778 3400 0.1859 951888
0.1145 40.0 3600 0.1812 1007760
0.1049 42.2222 3800 0.1895 1063648
0.112 44.4444 4000 0.1951 1119744
0.118 46.6667 4200 0.2037 1175680
0.0896 48.8889 4400 0.2054 1231696
0.0872 51.1111 4600 0.2163 1287744
0.0962 53.3333 4800 0.2126 1343760
0.0264 55.5556 5000 0.2334 1399856
0.0622 57.7778 5200 0.2313 1455808
0.0771 60.0 5400 0.2415 1511856
0.0546 62.2222 5600 0.2533 1567808
0.0505 64.4444 5800 0.2643 1623744
0.064 66.6667 6000 0.2710 1679888
0.0508 68.8889 6200 0.2814 1735952
0.0195 71.1111 6400 0.2748 1791904
0.1238 73.3333 6600 0.3053 1847888
0.0565 75.5556 6800 0.2764 1904000
0.0486 77.7778 7000 0.2620 1959920
0.0104 80.0 7200 0.2914 2015792
0.0327 82.2222 7400 0.2978 2071808
0.1221 84.4444 7600 0.3490 2127808
0.0092 86.6667 7800 0.2913 2183888
0.0219 88.8889 8000 0.2781 2239840
0.0414 91.1111 8200 0.3033 2295888
0.0661 93.3333 8400 0.3191 2351872
0.0074 95.5556 8600 0.3363 2407824
0.1671 97.7778 8800 0.2990 2463744
0.0065 100.0 9000 0.3083 2519680
0.0295 102.2222 9200 0.3234 2575584
0.1017 104.4444 9400 0.3196 2631680
0.0042 106.6667 9600 0.3666 2687728
0.0021 108.8889 9800 0.2994 2743792
0.0393 111.1111 10000 0.3243 2799840
0.0015 113.3333 10200 0.3416 2855808
0.0477 115.5556 10400 0.3560 2911648
0.0404 117.7778 10600 0.3516 2967856
0.0018 120.0 10800 0.3122 3023792
0.002 122.2222 11000 0.3434 3079920
0.0019 124.4444 11200 0.3747 3135904
0.0306 126.6667 11400 0.4096 3191808
0.0324 128.8889 11600 0.3991 3247840
0.028 131.1111 11800 0.3552 3303712
0.0101 133.3333 12000 0.4193 3359680
0.0032 135.5556 12200 0.4205 3415824
0.0037 137.7778 12400 0.4430 3471520
0.0333 140.0 12600 0.3999 3527664
0.0068 142.2222 12800 0.4266 3583696
0.0043 144.4444 13000 0.4580 3639680
0.0195 146.6667 13200 0.5149 3695712
0.0005 148.8889 13400 0.4669 3751728
0.0008 151.1111 13600 0.4736 3807744
0.0017 153.3333 13800 0.4851 3863664
0.0016 155.5556 14000 0.4695 3919584
0.0013 157.7778 14200 0.5366 3975568
0.0003 160.0 14400 0.5121 4031632
0.0016 162.2222 14600 0.5168 4087632
0.0017 164.4444 14800 0.5455 4143664
0.0003 166.6667 15000 0.4630 4199552
0.0015 168.8889 15200 0.5378 4255584
0.0001 171.1111 15400 0.5205 4311504
0.0002 173.3333 15600 0.5626 4367408
0.0041 175.5556 15800 0.6111 4423376
0.0007 177.7778 16000 0.5544 4479456
0.0005 180.0 16200 0.5789 4535504
0.0003 182.2222 16400 0.5728 4591504
0.0001 184.4444 16600 0.5941 4647424
0.0004 186.6667 16800 0.6518 4703376
0.0003 188.8889 17000 0.6518 4759552
0.0002 191.1111 17200 0.6297 4815552
0.0001 193.3333 17400 0.6060 4871600
0.0001 195.5556 17600 0.6693 4927696
0.0001 197.7778 17800 0.6281 4983424
0.0002 200.0 18000 0.6264 5039536
0.0001 202.2222 18200 0.6403 5095376
0.0001 204.4444 18400 0.6769 5151440
0.0001 206.6667 18600 0.6297 5207488
0.0001 208.8889 18800 0.6870 5263360
0.0001 211.1111 19000 0.6699 5319344
0.0001 213.3333 19200 0.6956 5375280
0.0002 215.5556 19400 0.7283 5431520
0.0 217.7778 19600 0.7154 5487472
0.0 220.0 19800 0.7333 5543504
0.0 222.2222 20000 0.7466 5599440
0.0001 224.4444 20200 0.6737 5655424
0.0 226.6667 20400 0.7212 5711344
0.0 228.8889 20600 0.7255 5767376
0.0001 231.1111 20800 0.7555 5823264
0.0 233.3333 21000 0.7266 5879248
0.0 235.5556 21200 0.7732 5935168
0.0001 237.7778 21400 0.7358 5991232
0.0 240.0 21600 0.8029 6047376
0.0 242.2222 21800 0.7473 6103328
0.0 244.4444 22000 0.7876 6159376
0.0 246.6667 22200 0.8027 6215360
0.0 248.8889 22400 0.7958 6271232
0.0 251.1111 22600 0.7958 6327136
0.0 253.3333 22800 0.8212 6383248
0.0 255.5556 23000 0.7978 6439168
0.0 257.7778 23200 0.7724 6495280
0.0 260.0 23400 0.8361 6551264
0.0 262.2222 23600 0.7747 6607424
0.0 264.4444 23800 0.8496 6663168
0.0 266.6667 24000 0.8076 6719216
0.0 268.8889 24200 0.8437 6775344
0.0 271.1111 24400 0.8419 6831344
0.0 273.3333 24600 0.8430 6887344
0.0 275.5556 24800 0.8464 6943632
0.0 277.7778 25000 0.8017 6999632
0.0 280.0 25200 0.8742 7055664
0.0 282.2222 25400 0.8131 7111664
0.0 284.4444 25600 0.8320 7167744
0.0 286.6667 25800 0.8106 7223696
0.0 288.8889 26000 0.8208 7279760
0.0 291.1111 26200 0.8450 7335792
0.0 293.3333 26400 0.8504 7391808
0.0 295.5556 26600 0.8602 7447808
0.0 297.7778 26800 0.8606 7503824
0.0 300.0 27000 0.8233 7559856
0.0 302.2222 27200 0.8203 7615904
0.0 304.4444 27400 0.8417 7672000
0.0 306.6667 27600 0.8962 7727808
0.0 308.8889 27800 0.8647 7783744
0.0 311.1111 28000 0.8644 7839808
0.0 313.3333 28200 0.8796 7895872
0.0 315.5556 28400 0.8325 7951664
0.0 317.7778 28600 0.8311 8007744
0.0 320.0 28800 0.8027 8063616
0.0 322.2222 29000 0.8451 8119520
0.0 324.4444 29200 0.9005 8175584
0.0 326.6667 29400 0.8589 8231760
0.0 328.8889 29600 0.8994 8287696
0.0 331.1111 29800 0.8628 8343760
0.0 333.3333 30000 0.8770 8399696
0.0 335.5556 30200 0.8600 8455776
0.0 337.7778 30400 0.8756 8511760
0.0 340.0 30600 0.8546 8567792
0.0 342.2222 30800 0.8787 8623728
0.0 344.4444 31000 0.9041 8679920
0.0 346.6667 31200 0.8960 8736032
0.0 348.8889 31400 0.8573 8791888
0.0 351.1111 31600 0.8749 8847728
0.0 353.3333 31800 0.9053 8903952
0.0 355.5556 32000 0.8862 8959920
0.0 357.7778 32200 0.8689 9016096
0.0 360.0 32400 0.8738 9072192
0.0 362.2222 32600 0.8964 9128272
0.0 364.4444 32800 0.8606 9184240
0.0 366.6667 33000 0.8734 9240064
0.0 368.8889 33200 0.9237 9295952
0.0 371.1111 33400 0.8652 9352016
0.0 373.3333 33600 0.9286 9407968
0.0 375.5556 33800 0.9074 9463920
0.0 377.7778 34000 0.8856 9519984
0.0 380.0 34200 0.8791 9575936
0.0 382.2222 34400 0.8856 9631952
0.0 384.4444 34600 0.9035 9687936
0.0 386.6667 34800 0.8925 9743968
0.0 388.8889 35000 0.8827 9800016
0.0 391.1111 35200 0.9023 9856016
0.0 393.3333 35400 0.9180 9912112
0.0 395.5556 35600 0.8993 9968112
0.0 397.7778 35800 0.8826 10024160
0.0 400.0 36000 0.9163 10080240
0.0 402.2222 36200 0.9174 10136208
0.0 404.4444 36400 0.8985 10192208
0.0 406.6667 36600 0.9004 10248192
0.0 408.8889 36800 0.9409 10304144
0.0 411.1111 37000 0.9265 10360192
0.0 413.3333 37200 0.8856 10416288
0.0 415.5556 37400 0.8770 10472368
0.0 417.7778 37600 0.8981 10528352
0.0 420.0 37800 0.9223 10584384
0.0 422.2222 38000 0.9099 10640496
0.0 424.4444 38200 0.9754 10696528
0.0 426.6667 38400 0.9143 10752640
0.0 428.8889 38600 0.8929 10808672
0.0 431.1111 38800 0.9007 10864512
0.0 433.3333 39000 0.9196 10920608
0.0 435.5556 39200 0.9077 10976624
0.0 437.7778 39400 0.9077 11032608
0.0 440.0 39600 0.9077 11088720
0.0 442.2222 39800 0.9077 11144688
0.0 444.4444 40000 0.9077 11200800

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1