train_stsb_1745333587

This model is a fine-tuned version of google/gemma-3-1b-it on the stsb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3016
  • Num Input Tokens Seen: 61089232
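
The checkpoint is a PEFT adapter on top of google/gemma-3-1b-it (see the framework versions at the end of this card). A minimal loading sketch, assuming the adapter is hosted under rbelanec/train_stsb_1745333587:

```python
# Minimal loading sketch. Assumptions: this repository is a PEFT adapter
# (per the "Framework versions" section below) hosted at
# rbelanec/train_stsb_1745333587.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it")

# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, "rbelanec/train_stsb_1745333587")
model.eval()
```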

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
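
These settings map onto Transformers TrainingArguments roughly as sketched below; the listed values are copied from above, while everything else (e.g. output_dir) is an assumed placeholder:

```python
# Rough reconstruction of the run configuration as TrainingArguments.
# Values mirror the hyperparameter list above; output_dir is assumed.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_stsb_1745333587",  # assumed placeholder
    learning_rate=0.3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,       # 4 devices-steps x 4 = 16 effective
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,                     # "training_steps" above
)
```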

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.6338 | 0.6182 | 200 | 0.7335 | 305312 |
| 0.4361 | 1.2349 | 400 | 0.5119 | 610048 |
| 0.4069 | 1.8532 | 600 | 0.4508 | 917664 |
| 0.3162 | 2.4699 | 800 | 0.4030 | 1223104 |
| 0.3616 | 3.0866 | 1000 | 0.4255 | 1528432 |
| 0.536 | 3.7048 | 1200 | 0.4218 | 1837520 |
| 0.3726 | 4.3215 | 1400 | 0.3733 | 2143216 |
| 0.3062 | 4.9397 | 1600 | 0.3706 | 2448176 |
| 0.2859 | 5.5564 | 1800 | 0.3829 | 2752768 |
| 0.3256 | 6.1731 | 2000 | 0.3627 | 3059504 |
| 0.3641 | 6.7913 | 2200 | 0.4599 | 3364688 |
| 0.4271 | 7.4080 | 2400 | 0.4038 | 3672432 |
| 0.2832 | 8.0247 | 2600 | 0.3646 | 3978272 |
| 0.3205 | 8.6430 | 2800 | 0.3523 | 4285856 |
| 0.329 | 9.2597 | 3000 | 0.3370 | 4588608 |
| 0.2854 | 9.8779 | 3200 | 0.3341 | 4894432 |
| 0.3865 | 10.4946 | 3400 | 0.3601 | 5200528 |
| 0.2422 | 11.1113 | 3600 | 0.3311 | 5504960 |
| 0.2637 | 11.7295 | 3800 | 0.3185 | 5808800 |
| 0.2363 | 12.3462 | 4000 | 0.3221 | 6114608 |
| 0.3095 | 12.9645 | 4200 | 0.3179 | 6419376 |
| 0.2563 | 13.5811 | 4400 | 0.3217 | 6725664 |
| 0.2999 | 14.1978 | 4600 | 0.3307 | 7030208 |
| 0.3607 | 14.8161 | 4800 | 0.3180 | 7335712 |
| 0.221 | 15.4328 | 5000 | 0.3062 | 7641232 |
| 0.2389 | 16.0495 | 5200 | 0.3139 | 7945360 |
| 0.2197 | 16.6677 | 5400 | 0.3232 | 8252048 |
| 0.2517 | 17.2844 | 5600 | 0.3066 | 8557024 |
| 0.2513 | 17.9026 | 5800 | 0.3103 | 8862080 |
| 0.2479 | 18.5193 | 6000 | 0.3411 | 9167248 |
| 0.2169 | 19.1360 | 6200 | 0.3285 | 9472816 |
| 0.2815 | 19.7543 | 6400 | 0.3114 | 9779344 |
| 0.2464 | 20.3709 | 6600 | 0.3254 | 10085888 |
| 0.2408 | 20.9892 | 6800 | 0.3308 | 10391904 |
| 0.2412 | 21.6059 | 7000 | 0.3197 | 10697664 |
| 0.2595 | 22.2226 | 7200 | 0.3335 | 11000832 |
| 0.2116 | 22.8408 | 7400 | 0.3016 | 11308384 |
| 0.2089 | 23.4575 | 7600 | 0.3092 | 11614048 |
| 0.2442 | 24.0742 | 7800 | 0.3080 | 11917328 |
| 0.2222 | 24.6924 | 8000 | 0.3133 | 12224848 |
| 0.2086 | 25.3091 | 8200 | 0.3171 | 12530128 |
| 0.2304 | 25.9274 | 8400 | 0.3075 | 12838032 |
| 0.2304 | 26.5440 | 8600 | 0.3286 | 13142096 |
| 0.2231 | 27.1607 | 8800 | 0.3456 | 13447712 |
| 0.2363 | 27.7790 | 9000 | 0.3058 | 13751968 |
| 0.1782 | 28.3957 | 9200 | 0.3350 | 14060176 |
| 0.2115 | 29.0124 | 9400 | 0.3257 | 14362928 |
| 0.2082 | 29.6306 | 9600 | 0.3248 | 14669168 |
| 0.2693 | 30.2473 | 9800 | 0.3120 | 14973568 |
| 0.2111 | 30.8655 | 10000 | 0.3342 | 15279840 |
| 0.2595 | 31.4822 | 10200 | 0.3324 | 15586352 |
| 0.2011 | 32.0989 | 10400 | 0.3340 | 15891232 |
| 0.2554 | 32.7172 | 10600 | 0.3228 | 16197472 |
| 0.1682 | 33.3338 | 10800 | 0.3253 | 16500992 |
| 0.1949 | 33.9521 | 11000 | 0.3222 | 16807808 |
| 0.1776 | 34.5688 | 11200 | 0.3391 | 17112928 |
| 0.1758 | 35.1855 | 11400 | 0.3517 | 17420016 |
| 0.2342 | 35.8037 | 11600 | 0.3492 | 17726608 |
| 0.1868 | 36.4204 | 11800 | 0.3338 | 18030288 |
| 0.1965 | 37.0371 | 12000 | 0.3418 | 18337584 |
| 0.2001 | 37.6553 | 12200 | 0.3389 | 18640720 |
| 0.1705 | 38.2720 | 12400 | 0.3572 | 18946400 |
| 0.2201 | 38.8903 | 12600 | 0.3563 | 19254240 |
| 0.1896 | 39.5070 | 12800 | 0.3264 | 19558592 |
| 0.154 | 40.1236 | 13000 | 0.3620 | 19861168 |
| 0.1773 | 40.7419 | 13200 | 0.3859 | 20169712 |
| 0.1576 | 41.3586 | 13400 | 0.3768 | 20475008 |
| 0.1768 | 41.9768 | 13600 | 0.3605 | 20782016 |
| 0.1917 | 42.5935 | 13800 | 0.3671 | 21085440 |
| 0.1811 | 43.2102 | 14000 | 0.3727 | 21391616 |
| 0.1895 | 43.8284 | 14200 | 0.3919 | 21696768 |
| 0.169 | 44.4451 | 14400 | 0.4041 | 22001488 |
| 0.1516 | 45.0618 | 14600 | 0.3888 | 22307216 |
| 0.1491 | 45.6801 | 14800 | 0.4216 | 22612016 |
| 0.1673 | 46.2968 | 15000 | 0.4129 | 22917744 |
| 0.1478 | 46.9150 | 15200 | 0.4313 | 23224720 |
| 0.1482 | 47.5317 | 15400 | 0.4394 | 23531040 |
| 0.1393 | 48.1484 | 15600 | 0.4659 | 23836048 |
| 0.1844 | 48.7666 | 15800 | 0.4073 | 24140240 |
| 0.1588 | 49.3833 | 16000 | 0.4706 | 24445104 |
| 0.1329 | 50.0 | 16200 | 0.4183 | 24750256 |
| 0.1453 | 50.6182 | 16400 | 0.4446 | 25055056 |
| 0.1461 | 51.2349 | 16600 | 0.4712 | 25360976 |
| 0.1556 | 51.8532 | 16800 | 0.4585 | 25669136 |
| 0.1392 | 52.4699 | 17000 | 0.4510 | 25972400 |
| 0.1256 | 53.0866 | 17200 | 0.5056 | 26280272 |
| 0.1481 | 53.7048 | 17400 | 0.4870 | 26583184 |
| 0.1249 | 54.3215 | 17600 | 0.4830 | 26891152 |
| 0.1095 | 54.9397 | 17800 | 0.4971 | 27197008 |
| 0.1197 | 55.5564 | 18000 | 0.5030 | 27500160 |
| 0.1382 | 56.1731 | 18200 | 0.5288 | 27805616 |
| 0.1393 | 56.7913 | 18400 | 0.5456 | 28112336 |
| 0.1149 | 57.4080 | 18600 | 0.5278 | 28419888 |
| 0.1033 | 58.0247 | 18800 | 0.5350 | 28724096 |
| 0.1095 | 58.6430 | 19000 | 0.5425 | 29031328 |
| 0.1111 | 59.2597 | 19200 | 0.5870 | 29336560 |
| 0.1431 | 59.8779 | 19400 | 0.5120 | 29642224 |
| 0.1157 | 60.4946 | 19600 | 0.5720 | 29947456 |
| 0.1251 | 61.1113 | 19800 | 0.5996 | 30252288 |
| 0.1074 | 61.7295 | 20000 | 0.5862 | 30557408 |
| 0.1074 | 62.3462 | 20200 | 0.5969 | 30862656 |
| 0.1173 | 62.9645 | 20400 | 0.5943 | 31169472 |
| 0.0826 | 63.5811 | 20600 | 0.6368 | 31474928 |
| 0.097 | 64.1978 | 20800 | 0.6193 | 31778496 |
| 0.1155 | 64.8161 | 21000 | 0.6350 | 32086304 |
| 0.0968 | 65.4328 | 21200 | 0.6808 | 32389328 |
| 0.0915 | 66.0495 | 21400 | 0.6950 | 32696656 |
| 0.0858 | 66.6677 | 21600 | 0.6440 | 33001008 |
| 0.073 | 67.2844 | 21800 | 0.6853 | 33306288 |
| 0.0952 | 67.9026 | 22000 | 0.6575 | 33612592 |
| 0.1051 | 68.5193 | 22200 | 0.6931 | 33914992 |
| 0.0765 | 69.1360 | 22400 | 0.7210 | 34219808 |
| 0.0672 | 69.7543 | 22600 | 0.7424 | 34525536 |
| 0.0867 | 70.3709 | 22800 | 0.7057 | 34829856 |
| 0.0617 | 70.9892 | 23000 | 0.7523 | 35134560 |
| 0.0674 | 71.6059 | 23200 | 0.7587 | 35439168 |
| 0.095 | 72.2226 | 23400 | 0.7627 | 35744608 |
| 0.0599 | 72.8408 | 23600 | 0.7685 | 36050688 |
| 0.0686 | 73.4575 | 23800 | 0.7728 | 36353808 |
| 0.0658 | 74.0742 | 24000 | 0.8231 | 36660560 |
| 0.0772 | 74.6924 | 24200 | 0.8133 | 36968464 |
| 0.0546 | 75.3091 | 24400 | 0.8044 | 37273264 |
| 0.0618 | 75.9274 | 24600 | 0.7944 | 37578896 |
| 0.0479 | 76.5440 | 24800 | 0.8679 | 37882832 |
| 0.0439 | 77.1607 | 25000 | 0.8588 | 38187312 |
| 0.0516 | 77.7790 | 25200 | 0.8604 | 38492720 |
| 0.0392 | 78.3957 | 25400 | 0.9089 | 38796864 |
| 0.0465 | 79.0124 | 25600 | 0.9304 | 39103824 |
| 0.0358 | 79.6306 | 25800 | 0.9374 | 39410448 |
| 0.0374 | 80.2473 | 26000 | 0.9326 | 39715280 |
| 0.0326 | 80.8655 | 26200 | 0.9603 | 40021520 |
| 0.0508 | 81.4822 | 26400 | 0.9404 | 40325376 |
| 0.0431 | 82.0989 | 26600 | 0.9616 | 40631296 |
| 0.0345 | 82.7172 | 26800 | 0.9698 | 40937312 |
| 0.0342 | 83.3338 | 27000 | 1.0021 | 41240464 |
| 0.0292 | 83.9521 | 27200 | 0.9857 | 41550128 |
| 0.0239 | 84.5688 | 27400 | 1.0404 | 41855152 |
| 0.0195 | 85.1855 | 27600 | 1.0455 | 42158912 |
| 0.0268 | 85.8037 | 27800 | 1.0351 | 42461856 |
| 0.027 | 86.4204 | 28000 | 1.0425 | 42769760 |
| 0.0157 | 87.0371 | 28200 | 1.0615 | 43074800 |
| 0.0328 | 87.6553 | 28400 | 1.0537 | 43378640 |
| 0.015 | 88.2720 | 28600 | 1.0599 | 43683840 |
| 0.0171 | 88.8903 | 28800 | 1.1204 | 43988256 |
| 0.0144 | 89.5070 | 29000 | 1.1128 | 44294256 |
| 0.0139 | 90.1236 | 29200 | 1.1181 | 44598464 |
| 0.019 | 90.7419 | 29400 | 1.1521 | 44904928 |
| 0.0091 | 91.3586 | 29600 | 1.1771 | 45208784 |
| 0.0167 | 91.9768 | 29800 | 1.1665 | 45516336 |
| 0.0069 | 92.5935 | 30000 | 1.1925 | 45820432 |
| 0.0067 | 93.2102 | 30200 | 1.1542 | 46127408 |
| 0.0105 | 93.8284 | 30400 | 1.1779 | 46431888 |
| 0.0161 | 94.4451 | 30600 | 1.2037 | 46736368 |
| 0.01 | 95.0618 | 30800 | 1.2178 | 47043472 |
| 0.0053 | 95.6801 | 31000 | 1.2279 | 47348976 |
| 0.0045 | 96.2968 | 31200 | 1.2609 | 47652864 |
| 0.0054 | 96.9150 | 31400 | 1.2644 | 47959872 |
| 0.0071 | 97.5317 | 31600 | 1.2636 | 48265392 |
| 0.0031 | 98.1484 | 31800 | 1.2654 | 48569984 |
| 0.006 | 98.7666 | 32000 | 1.2886 | 48874016 |
| 0.0078 | 99.3833 | 32200 | 1.2597 | 49181056 |
| 0.0028 | 100.0 | 32400 | 1.2977 | 49485120 |
| 0.0038 | 100.6182 | 32600 | 1.2805 | 49790304 |
| 0.0031 | 101.2349 | 32800 | 1.2831 | 50097008 |
| 0.0034 | 101.8532 | 33000 | 1.3035 | 50403088 |
| 0.0029 | 102.4699 | 33200 | 1.3286 | 50707088 |
| 0.0022 | 103.0866 | 33400 | 1.3272 | 51010144 |
| 0.0015 | 103.7048 | 33600 | 1.3404 | 51318976 |
| 0.0049 | 104.3215 | 33800 | 1.3586 | 51623136 |
| 0.0018 | 104.9397 | 34000 | 1.3730 | 51930240 |
| 0.0014 | 105.5564 | 34200 | 1.3785 | 52233888 |
| 0.0022 | 106.1731 | 34400 | 1.3781 | 52541008 |
| 0.0038 | 106.7913 | 34600 | 1.3860 | 52845904 |
| 0.0014 | 107.4080 | 34800 | 1.3796 | 53150720 |
| 0.0017 | 108.0247 | 35000 | 1.3928 | 53456816 |
| 0.0018 | 108.6430 | 35200 | 1.3678 | 53760816 |
| 0.0015 | 109.2597 | 35400 | 1.3936 | 54066160 |
| 0.0021 | 109.8779 | 35600 | 1.3884 | 54371888 |
| 0.001 | 110.4946 | 35800 | 1.3918 | 54676672 |
| 0.0012 | 111.1113 | 36000 | 1.4058 | 54983008 |
| 0.0007 | 111.7295 | 36200 | 1.4167 | 55289472 |
| 0.0008 | 112.3462 | 36400 | 1.4215 | 55591632 |
| 0.001 | 112.9645 | 36600 | 1.4282 | 55898640 |
| 0.0009 | 113.5811 | 36800 | 1.4258 | 56202288 |
| 0.0012 | 114.1978 | 37000 | 1.4355 | 56510384 |
| 0.001 | 114.8161 | 37200 | 1.4351 | 56816880 |
| 0.0013 | 115.4328 | 37400 | 1.4372 | 57119232 |
| 0.0012 | 116.0495 | 37600 | 1.4366 | 57424224 |
| 0.0006 | 116.6677 | 37800 | 1.4440 | 57729856 |
| 0.0007 | 117.2844 | 38000 | 1.4441 | 58034352 |
| 0.0005 | 117.9026 | 38200 | 1.4475 | 58342576 |
| 0.0007 | 118.5193 | 38400 | 1.4481 | 58648384 |
| 0.0007 | 119.1360 | 38600 | 1.4491 | 58953568 |
| 0.0009 | 119.7543 | 38800 | 1.4523 | 59257088 |
| 0.0007 | 120.3709 | 39000 | 1.4505 | 59562208 |
| 0.0005 | 120.9892 | 39200 | 1.4553 | 59867712 |
| 0.0011 | 121.6059 | 39400 | 1.4503 | 60173616 |
| 0.0008 | 122.2226 | 39600 | 1.4521 | 60476592 |
| 0.0005 | 122.8408 | 39800 | 1.4525 | 60782960 |
| 0.0006 | 123.4575 | 40000 | 1.4483 | 61089232 |
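
The validation loss reaches its minimum of 0.3016 at step 7400, which matches the evaluation loss reported at the top of this card, then climbs steadily to about 1.45 by step 40000 while the training loss falls toward zero; the run overfits well before the 40000-step budget is spent. Selecting that best checkpoint from Trainer-style logs is a one-line reduction. A small sketch using three rows from the table above, with field names following transformers' log_history convention (the surrounding code is hypothetical):

```python
# Hypothetical sketch: pick the best checkpoint by minimum validation
# loss. The three records below are copied from the table above.
log_history = [
    {"step": 7200, "eval_loss": 0.3335},
    {"step": 7400, "eval_loss": 0.3016},
    {"step": 7600, "eval_loss": 0.3092},
]

best = min(log_history, key=lambda r: r["eval_loss"])
print(f"best checkpoint: step {best['step']}, eval_loss {best['eval_loss']}")
# -> best checkpoint: step 7400, eval_loss 0.3016
```

In practice, setting load_best_model_at_end=True with metric_for_best_model="eval_loss" in TrainingArguments performs this selection automatically during training.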

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1