train_wic_1745950286

This model is a fine-tuned version of google/gemma-3-1b-it on the wic (Word-in-Context) dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.1829
  • Num Input Tokens Seen: 13031928
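
The framework versions below list PEFT, so the published artifact is presumably a parameter-efficient adapter rather than full model weights. A minimal loading sketch, assuming the adapter is hosted under the repository id rbelanec/train_wic_1745950286; the example prompt is purely illustrative, since the card does not document the prompt template used for the WiC task.

```python
# Hedged usage sketch: attach the fine-tuned PEFT adapter to the base model.
# The adapter repository id and the example prompt are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-3-1b-it"
adapter_id = "rbelanec/train_wic_1745950286"  # assumed adapter repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # load adapter weights on top

prompt = "Does the word 'bank' carry the same meaning in both sentences?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```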

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
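
A hedged reconstruction of how these settings could map onto the Transformers Trainer API. The output directory is an assumption and the 200-step evaluation cadence is inferred from the results table below; dataset preparation and the PEFT configuration are omitted because the card does not document them.

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_wic_1745950286",   # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,       # effective train batch size: 2 * 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40_000,
    eval_strategy="steps",               # evaluation every 200 steps, per the results table
    eval_steps=200,
    logging_steps=200,
)
```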

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.2405 0.1637 200 0.2272 65024
0.2444 0.3275 400 0.2076 129984
0.1836 0.4912 600 0.2093 195024
0.2194 0.6549 800 0.2151 260624
0.1659 0.8187 1000 0.1949 325984
0.2497 0.9824 1200 0.1855 391280
0.1592 1.1457 1400 0.1898 456248
0.1656 1.3095 1600 0.1832 521464
0.1828 1.4732 1800 0.1972 586632
0.1453 1.6369 2000 0.1829 651384
0.1854 1.8007 2200 0.2061 716552
0.1525 1.9644 2400 0.1865 781992
0.2463 2.1277 2600 0.2462 847136
0.1961 2.2914 2800 0.2328 912064
0.1215 2.4552 3000 0.2152 977312
0.1197 2.6189 3200 0.2260 1042608
0.1144 2.7826 3400 0.2762 1107488
0.1973 2.9464 3600 0.2198 1172864
0.1349 3.1097 3800 0.2629 1238392
0.0858 3.2734 4000 0.2550 1303640
0.0194 3.4372 4200 0.2982 1368504
0.1123 3.6009 4400 0.2996 1433480
0.0105 3.7646 4600 0.2793 1499016
0.1267 3.9284 4800 0.2464 1563880
0.0014 4.0917 5000 0.3845 1628808
0.0025 4.2554 5200 0.4421 1693576
0.1797 4.4192 5400 0.3311 1758536
0.0828 4.5829 5600 0.3588 1823544
0.0808 4.7466 5800 0.3956 1889272
0.0063 4.9104 6000 0.3799 1954632
0.0075 5.0737 6200 0.4368 2019440
0.0012 5.2374 6400 0.4723 2084816
0.001 5.4011 6600 0.4949 2149632
0.1747 5.5649 6800 0.5430 2214864
0.0639 5.7286 7000 0.4384 2280368
0.0637 5.8923 7200 0.4027 2345632
0.063 6.0557 7400 0.4501 2410768
0.0001 6.2194 7600 0.6180 2476096
0.1011 6.3831 7800 0.5613 2541152
0.001 6.5469 8000 0.6509 2606016
0.0003 6.7106 8200 0.6392 2670896
0.0001 6.8743 8400 0.5872 2736160
0.1022 7.0377 8600 0.6749 2801120
0.0003 7.2014 8800 0.5979 2865872
0.0001 7.3651 9000 0.7463 2931072
0.063 7.5289 9200 0.6581 2996288
0.0005 7.6926 9400 0.5815 3061744
0.1314 7.8563 9600 0.7341 3126896
0.0 8.0196 9800 0.7032 3191832
0.003 8.1834 10000 0.7525 3257640
0.1229 8.3471 10200 0.7844 3322584
0.0982 8.5108 10400 0.7341 3387672
0.0002 8.6746 10600 0.6536 3452968
0.0002 8.8383 10800 0.6903 3518104
0.0 9.0016 11000 0.7062 3583216
0.0 9.1654 11200 0.6840 3648592
0.0021 9.3291 11400 0.6111 3713808
0.0008 9.4928 11600 0.7421 3778848
0.0038 9.6566 11800 0.7140 3844208
0.019 9.8203 12000 0.7294 3909264
0.0 9.9840 12200 0.7916 3974224
0.0013 10.1474 12400 0.7157 4039488
0.0 10.3111 12600 0.6758 4104512
0.0 10.4748 12800 0.8122 4169856
0.0013 10.6386 13000 0.9845 4234864
0.0 10.8023 13200 0.7000 4300144
0.0 10.9660 13400 0.7296 4365440
0.0017 11.1293 13600 0.8176 4430440
0.0 11.2931 13800 0.7893 4495784
0.0 11.4568 14000 0.7570 4560792
0.0 11.6205 14200 0.8518 4625720
0.0 11.7843 14400 1.0076 4690744
0.0005 11.9480 14600 0.8348 4756152
0.0 12.1113 14800 0.7687 4821256
0.0 12.2751 15000 1.0453 4886344
0.0001 12.4388 15200 0.6358 4951960
0.0001 12.6025 15400 0.7772 5016856
0.0 12.7663 15600 0.8290 5082248
0.0 12.9300 15800 0.7716 5147240
0.0 13.0933 16000 0.7830 5212440
0.0 13.2571 16200 0.7539 5277800
0.0 13.4208 16400 0.8583 5342760
0.0 13.5845 16600 0.7801 5407816
0.0001 13.7483 16800 0.7863 5473672
0.0 13.9120 17000 0.8642 5538456
0.0 14.0753 17200 0.9749 5603152
0.0 14.2391 17400 0.8830 5668048
0.0001 14.4028 17600 0.7809 5732816
0.0 14.5665 17800 0.8874 5798240
0.0 14.7302 18000 0.9713 5863936
0.0 14.8940 18200 0.8705 5929216
0.0 15.0573 18400 0.7759 5994376
0.1167 15.2210 18600 0.7499 6059464
0.0001 15.3848 18800 0.6859 6125240
0.0 15.5485 19000 0.8041 6190600
0.0 15.7122 19200 0.8098 6255240
0.0 15.8760 19400 0.8325 6320328
0.0 16.0393 19600 0.7892 6385240
0.0 16.2030 19800 0.9405 6450424
0.0 16.3668 20000 0.8678 6515688
0.0 16.5305 20200 0.7906 6580712
0.0001 16.6942 20400 0.9038 6646184
0.0 16.8580 20600 1.0150 6711480
0.0 17.0213 20800 0.7679 6776176
0.0 17.1850 21000 0.8505 6841120
0.0 17.3488 21200 0.9167 6906528
0.0 17.5125 21400 0.9113 6971568
0.0105 17.6762 21600 0.8083 7036832
0.0 17.8400 21800 0.8904 7102176
0.0 18.0033 22000 0.9389 7167168
0.0001 18.1670 22200 0.9043 7232736
0.0 18.3307 22400 1.0096 7297984
0.0 18.4945 22600 0.9893 7362832
0.0 18.6582 22800 0.8978 7428672
0.0 18.8219 23000 1.0489 7493504
0.0 18.9857 23200 1.0883 7558400
0.0008 19.1490 23400 0.8473 7623392
0.0 19.3127 23600 0.8169 7688624
0.0 19.4765 23800 0.9587 7753632
0.0 19.6402 24000 1.0419 7819136
0.0 19.8039 24200 1.0699 7884272
0.0 19.9677 24400 1.0770 7949504
0.0 20.1310 24600 1.1074 8014544
0.0 20.2947 24800 1.1210 8079920
0.0 20.4585 25000 1.1904 8145552
0.0 20.6222 25200 1.1880 8210688
0.0 20.7859 25400 1.2028 8275760
0.0 20.9497 25600 1.2060 8340784
0.0 21.1130 25800 1.2094 8405688
0.0 21.2767 26000 1.2216 8470664
0.0 21.4404 26200 1.2286 8535736
0.0 21.6042 26400 0.9804 8600728
0.0 21.7679 26600 1.0341 8666296
0.0 21.9316 26800 1.0696 8731640
0.0 22.0950 27000 1.0796 8796704
0.0 22.2587 27200 1.0898 8861792
0.0 22.4224 27400 1.1136 8927168
0.0 22.5862 27600 1.1366 8992240
0.0 22.7499 27800 1.1678 9057600
0.0 22.9136 28000 1.1880 9122992
0.0 23.0770 28200 1.1958 9187992
0.0 23.2407 28400 1.2037 9253112
0.0 23.4044 28600 1.2104 9318440
0.0 23.5682 28800 1.2259 9383656
0.0 23.7319 29000 1.2314 9448616
0.0 23.8956 29200 1.2489 9513976
0.0 24.0589 29400 1.2493 9579416
0.0 24.2227 29600 1.2538 9644664
0.0 24.3864 29800 1.2654 9710056
0.0 24.5501 30000 1.2774 9775272
0.0 24.7139 30200 1.2813 9840600
0.0 24.8776 30400 1.2899 9905368
0.0 25.0409 30600 1.2958 9970160
0.0 25.2047 30800 1.3080 10035200
0.0 25.3684 31000 1.3169 10100368
0.0 25.5321 31200 1.3205 10165552
0.0 25.6959 31400 1.3325 10230992
0.0 25.8596 31600 1.3377 10295840
0.0 26.0229 31800 1.3551 10360952
0.0 26.1867 32000 1.3486 10425832
0.0 26.3504 32200 1.3492 10490904
0.0 26.5141 32400 1.3634 10556056
0.0 26.6779 32600 1.3686 10621432
0.0 26.8416 32800 1.3747 10686808
0.0 27.0049 33000 1.3855 10751912
0.0 27.1686 33200 1.3891 10817272
0.0 27.3324 33400 1.3899 10882568
0.0 27.4961 33600 1.5643 10947368
0.0 27.6598 33800 1.5987 11012568
0.0 27.8236 34000 1.5874 11078056
0.0 27.9873 34200 1.5974 11143272
0.0 28.1506 34400 1.6002 11208128
0.0 28.3144 34600 1.5996 11273344
0.0 28.4781 34800 1.5725 11338704
0.0 28.6418 35000 1.5683 11404240
0.0 28.8056 35200 1.5776 11469056
0.0 28.9693 35400 1.5774 11534288
0.0 29.1326 35600 1.5750 11599248
0.0 29.2964 35800 1.5784 11664528
0.0 29.4601 36000 1.5636 11729904
0.0 29.6238 36200 1.5759 11794928
0.0 29.7876 36400 1.5668 11860400
0.0 29.9513 36600 1.5679 11925328
0.0 30.1146 36800 1.5811 11989944
0.0 30.2783 37000 1.5715 12054968
0.0 30.4421 37200 1.5768 12120184
0.0 30.6058 37400 1.5689 12185832
0.0 30.7695 37600 1.5570 12250664
0.0 30.9333 37800 1.5749 12315704
0.0 31.0966 38000 1.5868 12380824
0.0 31.2603 38200 1.5731 12446424
0.0 31.4241 38400 1.5788 12511800
0.0 31.5878 38600 1.5794 12576920
0.0 31.7515 38800 1.5718 12641896
0.0 31.9153 39000 1.5873 12706504
0.0 32.0786 39200 1.5778 12771208
0.0 32.2423 39400 1.5870 12836760
0.0 32.4061 39600 1.5795 12901944
0.0 32.5698 39800 1.5692 12967000
0.0 32.7335 40000 1.5683 13031928
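
Note: the evaluation loss reported at the top of this card (0.1829) matches the step-2000 checkpoint rather than the final row (1.5683). Validation loss rises steadily after roughly epoch 2 while training loss approaches zero, which suggests the adapter overfits well before the full 40000-step budget and that the reported figure corresponds to the best checkpoint.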

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1