collapse_gemma-2-2b_hs2_accumulate_iter17_sftsd2
This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.1063
- Num Input Tokens Seen: 88053512
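For a rough sense of scale, the evaluation loss can be converted to perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The `perplexity` helper is illustrative only.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.1063
# Validation loss at step 0 (before any fine-tuning), from the results table.
initial_loss = 1.3909


def perplexity(loss: float) -> float:
    """Convert a loss to perplexity.

    Assumption: the loss is mean per-token cross-entropy in nats,
    so perplexity = exp(loss).
    """
    return math.exp(loss)


print(f"initial perplexity: {perplexity(initial_loss):.2f}")
print(f"final perplexity:   {perplexity(eval_loss):.2f}")
```

If the loss were averaged differently (for example per sequence rather than per token), this conversion would not apply.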
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
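The relationship between the batch-size settings and the warmup schedule above can be sketched in plain Python. Two values here are assumptions rather than facts from this card: `num_devices = 1` (the card does not state the device count, but 8 × 16 = 128 matches the reported total) and `total_steps = 1633` (estimated from the last logged row, step 1630 at epoch 0.9981). The `lr_at` function is a hypothetical helper that mimics a linear-warmup-then-constant schedule.

```python
import math

train_batch_size = 8              # per-device batch size from the card
gradient_accumulation_steps = 16  # from the card
num_devices = 1                   # assumption: not stated in the card

# Number of examples contributing to each optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

learning_rate = 8e-06  # peak learning rate from the card
warmup_ratio = 0.05    # from the card
total_steps = 1633     # assumption: estimated from the training log
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then constant."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    return learning_rate


print(effective_batch_size, warmup_steps)
for step in (0, warmup_steps // 2, warmup_steps, total_steps):
    print(step, lr_at(step))
```

Under these assumptions the learning rate reaches its peak after roughly the first 5% of optimizer steps and stays there for the rest of the epoch.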
Training results
Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
---|---|---|---|---|
No log | 0 | 0 | 1.3909 | 0 |
1.5736 | 0.0031 | 5 | 1.3904 | 266064 |
1.5849 | 0.0061 | 10 | 1.3824 | 532608 |
1.6237 | 0.0092 | 15 | 1.3568 | 800648 |
1.4925 | 0.0122 | 20 | 1.3233 | 1069960 |
1.425 | 0.0153 | 25 | 1.2772 | 1340336 |
1.3375 | 0.0184 | 30 | 1.2428 | 1618328 |
1.3168 | 0.0214 | 35 | 1.2158 | 1883528 |
1.1821 | 0.0245 | 40 | 1.1949 | 2156880 |
1.028 | 0.0276 | 45 | 1.2067 | 2426616 |
0.9191 | 0.0306 | 50 | 1.2329 | 2696936 |
0.8795 | 0.0337 | 55 | 1.2559 | 2959856 |
0.7068 | 0.0367 | 60 | 1.2837 | 3235920 |
0.6748 | 0.0398 | 65 | 1.2833 | 3505968 |
0.6779 | 0.0429 | 70 | 1.3042 | 3774728 |
0.5956 | 0.0459 | 75 | 1.2925 | 4040064 |
0.4015 | 0.0490 | 80 | 1.2852 | 4313048 |
0.4049 | 0.0520 | 85 | 1.2852 | 4578408 |
0.3789 | 0.0551 | 90 | 1.2529 | 4845240 |
0.3437 | 0.0582 | 95 | 1.2403 | 5110608 |
0.3502 | 0.0612 | 100 | 1.2330 | 5377008 |
0.349 | 0.0643 | 105 | 1.2313 | 5637656 |
0.2411 | 0.0674 | 110 | 1.2213 | 5910688 |
0.1947 | 0.0704 | 115 | 1.2223 | 6182624 |
0.1821 | 0.0735 | 120 | 1.2222 | 6450752 |
0.2998 | 0.0765 | 125 | 1.2122 | 6724528 |
0.1898 | 0.0796 | 130 | 1.2130 | 6997424 |
0.2205 | 0.0827 | 135 | 1.2191 | 7263352 |
0.2161 | 0.0857 | 140 | 1.2045 | 7529808 |
0.1721 | 0.0888 | 145 | 1.2103 | 7794424 |
0.2152 | 0.0918 | 150 | 1.2082 | 8074144 |
0.2182 | 0.0949 | 155 | 1.2075 | 8343952 |
0.1606 | 0.0980 | 160 | 1.1954 | 8613200 |
0.1279 | 0.1010 | 165 | 1.1955 | 8882128 |
0.1645 | 0.1041 | 170 | 1.2089 | 9151896 |
0.1806 | 0.1072 | 175 | 1.1875 | 9427696 |
0.1814 | 0.1102 | 180 | 1.1918 | 9698304 |
0.1758 | 0.1133 | 185 | 1.2009 | 9964584 |
0.1425 | 0.1163 | 190 | 1.1911 | 10233232 |
0.1517 | 0.1194 | 195 | 1.2067 | 10496416 |
0.207 | 0.1225 | 200 | 1.1928 | 10770528 |
0.1728 | 0.1255 | 205 | 1.1905 | 11050568 |
0.1739 | 0.1286 | 210 | 1.1854 | 11324672 |
0.1237 | 0.1316 | 215 | 1.1871 | 11602256 |
0.1409 | 0.1347 | 220 | 1.1852 | 11866248 |
0.1863 | 0.1378 | 225 | 1.1905 | 12128984 |
0.1872 | 0.1408 | 230 | 1.1880 | 12399312 |
0.1597 | 0.1439 | 235 | 1.1868 | 12662104 |
0.2231 | 0.1470 | 240 | 1.1880 | 12934456 |
0.1812 | 0.1500 | 245 | 1.1837 | 13201872 |
0.1872 | 0.1531 | 250 | 1.1857 | 13481856 |
0.1576 | 0.1561 | 255 | 1.1873 | 13747496 |
0.1806 | 0.1592 | 260 | 1.1831 | 14021696 |
0.121 | 0.1623 | 265 | 1.1829 | 14289528 |
0.1715 | 0.1653 | 270 | 1.1812 | 14563608 |
0.2055 | 0.1684 | 275 | 1.1806 | 14833880 |
0.2048 | 0.1715 | 280 | 1.1777 | 15101920 |
0.1839 | 0.1745 | 285 | 1.1759 | 15377432 |
0.1574 | 0.1776 | 290 | 1.1708 | 15652720 |
0.1673 | 0.1806 | 295 | 1.1717 | 15919016 |
0.1404 | 0.1837 | 300 | 1.1737 | 16186816 |
0.1095 | 0.1868 | 305 | 1.1718 | 16452392 |
0.1919 | 0.1898 | 310 | 1.1760 | 16723856 |
0.1794 | 0.1929 | 315 | 1.1704 | 16991736 |
0.1247 | 0.1959 | 320 | 1.1710 | 17267600 |
0.244 | 0.1990 | 325 | 1.1711 | 17535944 |
0.0914 | 0.2021 | 330 | 1.1671 | 17807192 |
0.1484 | 0.2051 | 335 | 1.1710 | 18076424 |
0.129 | 0.2082 | 340 | 1.1641 | 18349888 |
0.1811 | 0.2113 | 345 | 1.1714 | 18617448 |
0.1527 | 0.2143 | 350 | 1.1709 | 18883400 |
0.1208 | 0.2174 | 355 | 1.1658 | 19158488 |
0.192 | 0.2204 | 360 | 1.1730 | 19426160 |
0.121 | 0.2235 | 365 | 1.1661 | 19695544 |
0.1556 | 0.2266 | 370 | 1.1665 | 19963536 |
0.1333 | 0.2296 | 375 | 1.1639 | 20233616 |
0.1072 | 0.2327 | 380 | 1.1676 | 20503200 |
0.2181 | 0.2357 | 385 | 1.1661 | 20773704 |
0.1575 | 0.2388 | 390 | 1.1596 | 21047608 |
0.1127 | 0.2419 | 395 | 1.1636 | 21320408 |
0.1005 | 0.2449 | 400 | 1.1620 | 21589280 |
0.0988 | 0.2480 | 405 | 1.1597 | 21853552 |
0.1312 | 0.2511 | 410 | 1.1628 | 22121368 |
0.1634 | 0.2541 | 415 | 1.1626 | 22396712 |
0.1137 | 0.2572 | 420 | 1.1582 | 22663000 |
0.148 | 0.2602 | 425 | 1.1612 | 22928624 |
0.1876 | 0.2633 | 430 | 1.1579 | 23206960 |
0.1502 | 0.2664 | 435 | 1.1544 | 23479400 |
0.2014 | 0.2694 | 440 | 1.1561 | 23751896 |
0.1691 | 0.2725 | 445 | 1.1551 | 24016768 |
0.0988 | 0.2755 | 450 | 1.1560 | 24285968 |
0.1387 | 0.2786 | 455 | 1.1545 | 24553392 |
0.2191 | 0.2817 | 460 | 1.1537 | 24824568 |
0.1282 | 0.2847 | 465 | 1.1518 | 25091536 |
0.1344 | 0.2878 | 470 | 1.1536 | 25358376 |
0.1856 | 0.2909 | 475 | 1.1531 | 25627152 |
0.137 | 0.2939 | 480 | 1.1531 | 25898856 |
0.1176 | 0.2970 | 485 | 1.1519 | 26164632 |
0.1391 | 0.3000 | 490 | 1.1506 | 26436656 |
0.1356 | 0.3031 | 495 | 1.1491 | 26709664 |
0.1991 | 0.3062 | 500 | 1.1493 | 26981576 |
0.1221 | 0.3092 | 505 | 1.1498 | 27251440 |
0.1404 | 0.3123 | 510 | 1.1501 | 27523760 |
0.1588 | 0.3153 | 515 | 1.1536 | 27797712 |
0.1849 | 0.3184 | 520 | 1.1499 | 28073128 |
0.1244 | 0.3215 | 525 | 1.1439 | 28344400 |
0.109 | 0.3245 | 530 | 1.1516 | 28608320 |
0.1513 | 0.3276 | 535 | 1.1481 | 28877536 |
0.1676 | 0.3307 | 540 | 1.1458 | 29142728 |
0.1296 | 0.3337 | 545 | 1.1447 | 29410576 |
0.1777 | 0.3368 | 550 | 1.1472 | 29675760 |
0.133 | 0.3398 | 555 | 1.1427 | 29948576 |
0.1002 | 0.3429 | 560 | 1.1423 | 30218992 |
0.1241 | 0.3460 | 565 | 1.1479 | 30485616 |
0.1538 | 0.3490 | 570 | 1.1420 | 30759352 |
0.1057 | 0.3521 | 575 | 1.1408 | 31028560 |
0.1359 | 0.3551 | 580 | 1.1456 | 31290816 |
0.1654 | 0.3582 | 585 | 1.1452 | 31561008 |
0.1455 | 0.3613 | 590 | 1.1436 | 31832888 |
0.1637 | 0.3643 | 595 | 1.1475 | 32105384 |
0.1249 | 0.3674 | 600 | 1.1424 | 32376720 |
0.127 | 0.3705 | 605 | 1.1401 | 32648096 |
0.1861 | 0.3735 | 610 | 1.1452 | 32910632 |
0.132 | 0.3766 | 615 | 1.1439 | 33181488 |
0.0972 | 0.3796 | 620 | 1.1407 | 33454264 |
0.148 | 0.3827 | 625 | 1.1448 | 33717552 |
0.1048 | 0.3858 | 630 | 1.1418 | 33983712 |
0.1695 | 0.3888 | 635 | 1.1408 | 34255928 |
0.1705 | 0.3919 | 640 | 1.1385 | 34515656 |
0.0887 | 0.3949 | 645 | 1.1417 | 34788368 |
0.1562 | 0.3980 | 650 | 1.1466 | 35057688 |
0.0985 | 0.4011 | 655 | 1.1406 | 35324256 |
0.1099 | 0.4041 | 660 | 1.1391 | 35598888 |
0.0976 | 0.4072 | 665 | 1.1433 | 35868184 |
0.1421 | 0.4103 | 670 | 1.1409 | 36139392 |
0.1126 | 0.4133 | 675 | 1.1385 | 36403328 |
0.1148 | 0.4164 | 680 | 1.1387 | 36675568 |
0.1275 | 0.4194 | 685 | 1.1421 | 36945064 |
0.0943 | 0.4225 | 690 | 1.1391 | 37210400 |
0.207 | 0.4256 | 695 | 1.1366 | 37476840 |
0.1419 | 0.4286 | 700 | 1.1400 | 37745320 |
0.1542 | 0.4317 | 705 | 1.1377 | 38014200 |
0.1006 | 0.4347 | 710 | 1.1363 | 38283800 |
0.1188 | 0.4378 | 715 | 1.1375 | 38548184 |
0.1077 | 0.4409 | 720 | 1.1352 | 38820432 |
0.1449 | 0.4439 | 725 | 1.1358 | 39092008 |
0.1278 | 0.4470 | 730 | 1.1348 | 39357048 |
0.1263 | 0.4501 | 735 | 1.1323 | 39629568 |
0.1057 | 0.4531 | 740 | 1.1331 | 39900304 |
0.0979 | 0.4562 | 745 | 1.1352 | 40166120 |
0.1745 | 0.4592 | 750 | 1.1341 | 40443912 |
0.1632 | 0.4623 | 755 | 1.1352 | 40708248 |
0.1041 | 0.4654 | 760 | 1.1334 | 40980792 |
0.1932 | 0.4684 | 765 | 1.1348 | 41246584 |
0.1163 | 0.4715 | 770 | 1.1365 | 41519584 |
0.1055 | 0.4746 | 775 | 1.1318 | 41791584 |
0.1019 | 0.4776 | 780 | 1.1344 | 42060000 |
0.1509 | 0.4807 | 785 | 1.1323 | 42328704 |
0.1743 | 0.4837 | 790 | 1.1314 | 42592840 |
0.1334 | 0.4868 | 795 | 1.1328 | 42860912 |
0.1169 | 0.4899 | 800 | 1.1318 | 43131864 |
0.0828 | 0.4929 | 805 | 1.1290 | 43401472 |
0.1072 | 0.4960 | 810 | 1.1316 | 43674576 |
0.1898 | 0.4990 | 815 | 1.1307 | 43943344 |
0.113 | 0.5021 | 820 | 1.1280 | 44218136 |
0.1275 | 0.5052 | 825 | 1.1290 | 44480280 |
0.1214 | 0.5082 | 830 | 1.1314 | 44751848 |
0.1486 | 0.5113 | 835 | 1.1311 | 45025224 |
0.1467 | 0.5144 | 840 | 1.1279 | 45292376 |
0.1233 | 0.5174 | 845 | 1.1285 | 45561472 |
0.124 | 0.5205 | 850 | 1.1299 | 45830536 |
0.104 | 0.5235 | 855 | 1.1302 | 46093544 |
0.1301 | 0.5266 | 860 | 1.1323 | 46366936 |
0.1019 | 0.5297 | 865 | 1.1324 | 46632648 |
0.0981 | 0.5327 | 870 | 1.1285 | 46911112 |
0.1265 | 0.5358 | 875 | 1.1246 | 47171136 |
0.072 | 0.5388 | 880 | 1.1272 | 47440688 |
0.1325 | 0.5419 | 885 | 1.1293 | 47715336 |
0.1526 | 0.5450 | 890 | 1.1265 | 47985096 |
0.1172 | 0.5480 | 895 | 1.1273 | 48254832 |
0.1483 | 0.5511 | 900 | 1.1277 | 48521264 |
0.1134 | 0.5542 | 905 | 1.1239 | 48793360 |
0.1113 | 0.5572 | 910 | 1.1251 | 49057008 |
0.1357 | 0.5603 | 915 | 1.1261 | 49323256 |
0.1104 | 0.5633 | 920 | 1.1262 | 49587672 |
0.1824 | 0.5664 | 925 | 1.1248 | 49859912 |
0.1023 | 0.5695 | 930 | 1.1281 | 50125224 |
0.1441 | 0.5725 | 935 | 1.1268 | 50392232 |
0.1461 | 0.5756 | 940 | 1.1228 | 50665800 |
0.1533 | 0.5786 | 945 | 1.1253 | 50941600 |
0.0956 | 0.5817 | 950 | 1.1264 | 51209712 |
0.1664 | 0.5848 | 955 | 1.1240 | 51484784 |
0.1976 | 0.5878 | 960 | 1.1221 | 51758552 |
0.1333 | 0.5909 | 965 | 1.1223 | 52030248 |
0.1567 | 0.5940 | 970 | 1.1238 | 52293128 |
0.0832 | 0.5970 | 975 | 1.1231 | 52564072 |
0.1143 | 0.6001 | 980 | 1.1237 | 52834600 |
0.1325 | 0.6031 | 985 | 1.1229 | 53096984 |
0.1361 | 0.6062 | 990 | 1.1235 | 53369568 |
0.1092 | 0.6093 | 995 | 1.1238 | 53631136 |
0.1493 | 0.6123 | 1000 | 1.1248 | 53899136 |
0.2258 | 0.6154 | 1005 | 1.1247 | 54169264 |
0.1345 | 0.6184 | 1010 | 1.1215 | 54441112 |
0.129 | 0.6215 | 1015 | 1.1211 | 54709824 |
0.1168 | 0.6246 | 1020 | 1.1234 | 54976144 |
0.1313 | 0.6276 | 1025 | 1.1199 | 55244264 |
0.0983 | 0.6307 | 1030 | 1.1214 | 55516472 |
0.1457 | 0.6338 | 1035 | 1.1222 | 55779936 |
0.1473 | 0.6368 | 1040 | 1.1234 | 56044584 |
0.1032 | 0.6399 | 1045 | 1.1217 | 56304456 |
0.1594 | 0.6429 | 1050 | 1.1227 | 56572336 |
0.1376 | 0.6460 | 1055 | 1.1215 | 56847464 |
0.0811 | 0.6491 | 1060 | 1.1210 | 57117584 |
0.1372 | 0.6521 | 1065 | 1.1212 | 57386280 |
0.1731 | 0.6552 | 1070 | 1.1219 | 57650128 |
0.1318 | 0.6582 | 1075 | 1.1194 | 57920680 |
0.1476 | 0.6613 | 1080 | 1.1200 | 58194088 |
0.16 | 0.6644 | 1085 | 1.1195 | 58462936 |
0.1174 | 0.6674 | 1090 | 1.1181 | 58732288 |
0.1688 | 0.6705 | 1095 | 1.1208 | 58998088 |
0.1082 | 0.6736 | 1100 | 1.1183 | 59267664 |
0.1152 | 0.6766 | 1105 | 1.1189 | 59532680 |
0.1105 | 0.6797 | 1110 | 1.1195 | 59803696 |
0.0794 | 0.6827 | 1115 | 1.1185 | 60071384 |
0.14 | 0.6858 | 1120 | 1.1190 | 60343032 |
0.1035 | 0.6889 | 1125 | 1.1191 | 60620088 |
0.1563 | 0.6919 | 1130 | 1.1190 | 60893776 |
0.1307 | 0.6950 | 1135 | 1.1191 | 61163856 |
0.0936 | 0.6980 | 1140 | 1.1157 | 61432408 |
0.1282 | 0.7011 | 1145 | 1.1170 | 61705384 |
0.137 | 0.7042 | 1150 | 1.1170 | 61977984 |
0.0879 | 0.7072 | 1155 | 1.1170 | 62250680 |
0.1453 | 0.7103 | 1160 | 1.1168 | 62521536 |
0.1313 | 0.7134 | 1165 | 1.1168 | 62793472 |
0.1112 | 0.7164 | 1170 | 1.1176 | 63063072 |
0.1304 | 0.7195 | 1175 | 1.1187 | 63339648 |
0.1682 | 0.7225 | 1180 | 1.1150 | 63606808 |
0.1154 | 0.7256 | 1185 | 1.1149 | 63879664 |
0.1618 | 0.7287 | 1190 | 1.1165 | 64158584 |
0.2112 | 0.7317 | 1195 | 1.1161 | 64422560 |
0.2051 | 0.7348 | 1200 | 1.1161 | 64690400 |
0.1652 | 0.7378 | 1205 | 1.1176 | 64968608 |
0.1259 | 0.7409 | 1210 | 1.1168 | 65244408 |
0.1324 | 0.7440 | 1215 | 1.1143 | 65520944 |
0.0722 | 0.7470 | 1220 | 1.1168 | 65786192 |
0.0822 | 0.7501 | 1225 | 1.1169 | 66057064 |
0.1176 | 0.7532 | 1230 | 1.1151 | 66320016 |
0.1064 | 0.7562 | 1235 | 1.1153 | 66592696 |
0.1342 | 0.7593 | 1240 | 1.1174 | 66863744 |
0.1222 | 0.7623 | 1245 | 1.1165 | 67138872 |
0.0627 | 0.7654 | 1250 | 1.1168 | 67403104 |
0.1569 | 0.7685 | 1255 | 1.1185 | 67669760 |
0.1388 | 0.7715 | 1260 | 1.1151 | 67939928 |
0.1188 | 0.7746 | 1265 | 1.1149 | 68205008 |
0.1522 | 0.7777 | 1270 | 1.1165 | 68479072 |
0.082 | 0.7807 | 1275 | 1.1156 | 68751584 |
0.1179 | 0.7838 | 1280 | 1.1158 | 69026136 |
0.1174 | 0.7868 | 1285 | 1.1156 | 69292648 |
0.1224 | 0.7899 | 1290 | 1.1133 | 69565144 |
0.1119 | 0.7930 | 1295 | 1.1123 | 69834504 |
0.1223 | 0.7960 | 1300 | 1.1119 | 70107176 |
0.216 | 0.7991 | 1305 | 1.1143 | 70384992 |
0.1381 | 0.8021 | 1310 | 1.1171 | 70658552 |
0.2221 | 0.8052 | 1315 | 1.1148 | 70924448 |
0.1214 | 0.8083 | 1320 | 1.1125 | 71198664 |
0.1746 | 0.8113 | 1325 | 1.1127 | 71476200 |
0.1646 | 0.8144 | 1330 | 1.1133 | 71746760 |
0.1094 | 0.8175 | 1335 | 1.1103 | 72016648 |
0.0682 | 0.8205 | 1340 | 1.1112 | 72286464 |
0.1137 | 0.8236 | 1345 | 1.1133 | 72552768 |
0.1462 | 0.8266 | 1350 | 1.1122 | 72812368 |
0.0723 | 0.8297 | 1355 | 1.1141 | 73079168 |
0.0908 | 0.8328 | 1360 | 1.1151 | 73342696 |
0.1199 | 0.8358 | 1365 | 1.1131 | 73612808 |
0.1188 | 0.8389 | 1370 | 1.1123 | 73879472 |
0.1843 | 0.8419 | 1375 | 1.1169 | 74153504 |
0.1217 | 0.8450 | 1380 | 1.1157 | 74418904 |
0.1368 | 0.8481 | 1385 | 1.1111 | 74691656 |
0.1335 | 0.8511 | 1390 | 1.1101 | 74958992 |
0.1083 | 0.8542 | 1395 | 1.1112 | 75234416 |
0.1791 | 0.8573 | 1400 | 1.1117 | 75507256 |
0.0991 | 0.8603 | 1405 | 1.1124 | 75776872 |
0.1031 | 0.8634 | 1410 | 1.1126 | 76047184 |
0.0888 | 0.8664 | 1415 | 1.1108 | 76315792 |
0.1317 | 0.8695 | 1420 | 1.1109 | 76584664 |
0.0837 | 0.8726 | 1425 | 1.1109 | 76858648 |
0.0622 | 0.8756 | 1430 | 1.1125 | 77126656 |
0.1478 | 0.8787 | 1435 | 1.1137 | 77397928 |
0.1656 | 0.8817 | 1440 | 1.1118 | 77663288 |
0.0875 | 0.8848 | 1445 | 1.1114 | 77930920 |
0.1311 | 0.8879 | 1450 | 1.1141 | 78196560 |
0.1269 | 0.8909 | 1455 | 1.1114 | 78469544 |
0.143 | 0.8940 | 1460 | 1.1071 | 78734704 |
0.1493 | 0.8971 | 1465 | 1.1099 | 79008024 |
0.0974 | 0.9001 | 1470 | 1.1134 | 79278424 |
0.121 | 0.9032 | 1475 | 1.1103 | 79543032 |
0.1726 | 0.9062 | 1480 | 1.1106 | 79801264 |
0.0754 | 0.9093 | 1485 | 1.1077 | 80066288 |
0.1272 | 0.9124 | 1490 | 1.1081 | 80325544 |
0.1464 | 0.9154 | 1495 | 1.1091 | 80603072 |
0.1424 | 0.9185 | 1500 | 1.1098 | 80871376 |
0.1266 | 0.9215 | 1505 | 1.1100 | 81142608 |
0.1014 | 0.9246 | 1510 | 1.1097 | 81409856 |
0.1246 | 0.9277 | 1515 | 1.1097 | 81678432 |
0.0949 | 0.9307 | 1520 | 1.1116 | 81946976 |
0.1001 | 0.9338 | 1525 | 1.1109 | 82218240 |
0.1632 | 0.9369 | 1530 | 1.1092 | 82490040 |
0.1727 | 0.9399 | 1535 | 1.1078 | 82757488 |
0.1399 | 0.9430 | 1540 | 1.1068 | 83023952 |
0.1313 | 0.9460 | 1545 | 1.1088 | 83287480 |
0.0864 | 0.9491 | 1550 | 1.1074 | 83555232 |
0.1429 | 0.9522 | 1555 | 1.1071 | 83827824 |
0.1301 | 0.9552 | 1560 | 1.1095 | 84098432 |
0.1525 | 0.9583 | 1565 | 1.1081 | 84366904 |
0.1482 | 0.9613 | 1570 | 1.1069 | 84642272 |
0.0756 | 0.9644 | 1575 | 1.1078 | 84911192 |
0.1277 | 0.9675 | 1580 | 1.1078 | 85182424 |
0.1131 | 0.9705 | 1585 | 1.1072 | 85453760 |
0.141 | 0.9736 | 1590 | 1.1057 | 85719912 |
0.2215 | 0.9767 | 1595 | 1.1045 | 86002952 |
0.1193 | 0.9797 | 1600 | 1.1074 | 86273712 |
0.0915 | 0.9828 | 1605 | 1.1086 | 86543568 |
0.1059 | 0.9858 | 1610 | 1.1063 | 86810536 |
0.0992 | 0.9889 | 1615 | 1.1051 | 87081208 |
0.0997 | 0.9920 | 1620 | 1.1072 | 87350088 |
0.1491 | 0.9950 | 1625 | 1.1078 | 87619224 |
0.1224 | 0.9981 | 1630 | 1.1065 | 87890944 |
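The log above can be summarized with a small parser. The sketch below works on five rows copied from the table rather than the full log, and `parse_row` is a hypothetical helper written for this card's pipe-separated layout. The tokens-per-example figure assumes every optimizer step consumed 128 examples (the reported total train batch size) and may include padding tokens, depending on how the token counter was implemented.

```python
# A few rows copied verbatim from the results table (not the full log).
ROWS = """\
No log | 0 | 0 | 1.3909 | 0 |
1.1821 | 0.0245 | 40 | 1.1949 | 2156880 |
0.6779 | 0.0429 | 70 | 1.3042 | 3774728 |
0.2215 | 0.9767 | 1595 | 1.1045 | 86002952 |
0.1224 | 0.9981 | 1630 | 1.1065 | 87890944 |
"""


def parse_row(line: str) -> dict:
    """Parse one 'train | epoch | step | val | tokens |' row into a dict."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    train_loss, epoch, step, val_loss, tokens = cells
    return {
        # The step-0 row has no training loss yet ("No log").
        "train_loss": None if train_loss == "No log" else float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "tokens": int(tokens),
    }


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

best = min(rows, key=lambda r: r["val_loss"])  # lowest validation loss in the excerpt
last = rows[-1]

tokens_per_step = last["tokens"] / last["step"]
tokens_per_example = tokens_per_step / 128  # assumption: 128 examples per step
train_val_gap = last["val_loss"] - last["train_loss"]

print(f"best validation loss in excerpt: {best['val_loss']} at step {best['step']}")
print(f"tokens per optimizer step: {tokens_per_step:.0f}")
print(f"tokens per example: {tokens_per_example:.0f}")
print(f"validation minus training loss at last row: {train_val_gap:.4f}")
```

The same parser can be run over the full table to track how the gap between training and validation loss evolves across the epoch.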
Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
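To reproduce results, it may help to check a local environment against the versions listed above. The following is a minimal sketch using only the standard library; `CARD_VERSIONS` and `compare_versions` are hypothetical names, and the keys are assumed to be the PyPI distribution names for these packages.

```python
from importlib import metadata

# Versions listed in this card, keyed by (assumed) PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.44.0",
    "torch": "2.4.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {name: (wanted, installed_or_None, exact_match)} per package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions().items():
    status = "match" if ok else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```

An exact string match is deliberately strict; a different CUDA build suffix on PyTorch, for instance, would be reported as a difference even though it may behave the same.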