impossible-llms-spanish-random-fourgram
This model is a fine-tuned version of a base model that this card does not name, trained on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 7.8039
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 12
- eval_batch_size: 8
- seed: 0
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 384
- total_eval_batch_size: 32
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 3000
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
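
The derived values in the list above (total batch sizes, warmup length) follow from the per-device settings, and the learning-rate curve can be reproduced from the scheduler settings. The snippet below is a minimal sketch, not the original training script: the function names are illustrative, and it assumes the cosine schedule is a linear warmup from zero followed by a half-cosine decay to zero, with the warmup length taken as the ceiling of `training_steps * warmup_ratio`.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 12   # per device
EVAL_BATCH_SIZE = 8     # per device
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 8
TRAINING_STEPS = 3000
WARMUP_RATIO = 0.1


def total_train_batch_size() -> int:
    """Examples consumed per optimizer step across all devices."""
    return TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS


def total_eval_batch_size() -> int:
    """Examples evaluated per step across all devices (no accumulation)."""
    return EVAL_BATCH_SIZE * NUM_DEVICES


def warmup_steps() -> int:
    """Assumption: warmup length is ceil(training_steps * warmup_ratio)."""
    return math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Assumed schedule: linear warmup, then half-cosine decay to zero."""
    warmup = warmup_steps()
    if step < warmup:
        return LEARNING_RATE * step / max(1, warmup)
    progress = (step - warmup) / max(1, TRAINING_STEPS - warmup)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    print("total_eval_batch_size:", total_eval_batch_size())
    print("warmup_steps:", warmup_steps())
    for s in (0, 150, 300, 1650, 3000):
        print(f"step {s:>4}: lr = {lr_at(s):.2e}")
```

Under these assumptions the learning rate peaks at 0.0001 once warmup ends and decays toward zero by the final training step.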
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
82.9763 | 1.0 | 8 | 10.1322 |
75.8242 | 2.0 | 16 | 9.3826 |
73.0479 | 3.0 | 24 | 9.0547 |
71.7498 | 4.0 | 32 | 8.9132 |
70.3057 | 5.0 | 40 | 8.7517 |
69.052 | 6.0 | 48 | 8.5917 |
67.5792 | 7.0 | 56 | 8.4113 |
66.1479 | 8.0 | 64 | 8.1993 |
64.2954 | 9.0 | 72 | 7.9892 |
62.6406 | 10.0 | 80 | 7.7753 |
60.8006 | 11.0 | 88 | 7.5550 |
58.9376 | 12.0 | 96 | 7.3313 |
57.1674 | 13.0 | 104 | 7.1176 |
55.776 | 14.0 | 112 | 6.9191 |
54.0096 | 15.0 | 120 | 6.7468 |
53.058 | 16.0 | 128 | 6.6053 |
52.1354 | 17.0 | 136 | 6.4908 |
51.3954 | 18.0 | 144 | 6.4133 |
50.8581 | 19.0 | 152 | 6.3606 |
50.3787 | 20.0 | 160 | 6.3124 |
50.3071 | 21.0 | 168 | 6.2731 |
49.996 | 22.0 | 176 | 6.2434 |
49.3843 | 23.0 | 184 | 6.1913 |
49.2434 | 24.0 | 192 | 6.1592 |
49.0138 | 25.0 | 200 | 6.1245 |
48.5189 | 26.0 | 208 | 6.0972 |
48.3509 | 27.0 | 216 | 6.0685 |
48.2833 | 28.0 | 224 | 6.0437 |
47.6648 | 29.0 | 232 | 6.0185 |
47.6211 | 30.0 | 240 | 5.9975 |
47.4067 | 31.0 | 248 | 5.9759 |
47.2849 | 32.0 | 256 | 5.9579 |
46.9754 | 33.0 | 264 | 5.9414 |
46.9424 | 34.0 | 272 | 5.9332 |
46.761 | 35.0 | 280 | 5.9065 |
46.4459 | 36.0 | 288 | 5.8899 |
46.2071 | 37.0 | 296 | 5.8721 |
46.1396 | 38.0 | 304 | 5.8573 |
45.9891 | 39.0 | 312 | 5.8446 |
45.6531 | 40.0 | 320 | 5.8358 |
45.4177 | 41.0 | 328 | 5.8188 |
45.4286 | 42.0 | 336 | 5.8082 |
44.9946 | 43.0 | 344 | 5.7904 |
44.9944 | 44.0 | 352 | 5.7830 |
44.727 | 45.0 | 360 | 5.7677 |
44.3485 | 46.0 | 368 | 5.7522 |
44.4296 | 47.0 | 376 | 5.7454 |
44.2211 | 48.0 | 384 | 5.7408 |
44.11 | 49.0 | 392 | 5.7288 |
43.9877 | 50.0 | 400 | 5.7172 |
43.5827 | 51.0 | 408 | 5.7082 |
43.5725 | 52.0 | 416 | 5.7019 |
43.3969 | 53.0 | 424 | 5.6927 |
43.1558 | 54.0 | 432 | 5.6875 |
42.8696 | 55.0 | 440 | 5.6895 |
42.8345 | 56.0 | 448 | 5.6810 |
42.4876 | 57.0 | 456 | 5.6749 |
42.4041 | 58.0 | 464 | 5.6796 |
42.167 | 59.0 | 472 | 5.6686 |
42.0272 | 60.0 | 480 | 5.6692 |
41.9472 | 61.0 | 488 | 5.6711 |
41.817 | 62.0 | 496 | 5.6665 |
41.5342 | 63.0 | 504 | 5.6705 |
41.3826 | 64.0 | 512 | 5.6707 |
41.046 | 65.0 | 520 | 5.6737 |
40.9818 | 66.0 | 528 | 5.6719 |
40.8324 | 67.0 | 536 | 5.6784 |
40.5345 | 68.0 | 544 | 5.6842 |
40.5791 | 69.0 | 552 | 5.6817 |
40.3238 | 70.0 | 560 | 5.6913 |
40.2274 | 71.0 | 568 | 5.6917 |
40.0427 | 72.0 | 576 | 5.7009 |
39.6969 | 73.0 | 584 | 5.7088 |
39.4668 | 74.0 | 592 | 5.7189 |
39.3444 | 75.0 | 600 | 5.7193 |
39.1904 | 76.0 | 608 | 5.7337 |
38.9538 | 77.0 | 616 | 5.7351 |
38.8954 | 78.0 | 624 | 5.7491 |
38.3855 | 79.0 | 632 | 5.7509 |
38.4417 | 80.0 | 640 | 5.7696 |
38.3127 | 81.0 | 648 | 5.7773 |
38.1975 | 82.0 | 656 | 5.7909 |
37.9492 | 83.0 | 664 | 5.7997 |
37.5814 | 84.0 | 672 | 5.8131 |
37.5652 | 85.0 | 680 | 5.8178 |
37.3111 | 86.0 | 688 | 5.8396 |
37.0125 | 87.0 | 696 | 5.8435 |
37.1248 | 88.0 | 704 | 5.8526 |
36.7036 | 89.0 | 712 | 5.8648 |
36.6775 | 90.0 | 720 | 5.8754 |
36.3811 | 91.0 | 728 | 5.8901 |
36.3064 | 92.0 | 736 | 5.9137 |
35.9747 | 93.0 | 744 | 5.9306 |
35.9777 | 94.0 | 752 | 5.9342 |
35.6134 | 95.0 | 760 | 5.9527 |
35.3654 | 96.0 | 768 | 5.9639 |
35.3432 | 97.0 | 776 | 5.9763 |
35.2641 | 98.0 | 784 | 5.9824 |
35.039 | 99.0 | 792 | 5.9972 |
34.704 | 100.0 | 800 | 6.0233 |
34.6955 | 101.0 | 808 | 6.0342 |
34.4793 | 102.0 | 816 | 6.0511 |
34.2634 | 103.0 | 824 | 6.0523 |
34.3291 | 104.0 | 832 | 6.0636 |
34.0863 | 105.0 | 840 | 6.0937 |
33.6904 | 106.0 | 848 | 6.1048 |
33.5764 | 107.0 | 856 | 6.1333 |
33.4184 | 108.0 | 864 | 6.1246 |
33.2091 | 109.0 | 872 | 6.1466 |
33.1212 | 110.0 | 880 | 6.1638 |
32.8773 | 111.0 | 888 | 6.1685 |
32.8244 | 112.0 | 896 | 6.1778 |
32.563 | 113.0 | 904 | 6.2110 |
32.5085 | 114.0 | 912 | 6.2149 |
32.0799 | 115.0 | 920 | 6.2259 |
32.0791 | 116.0 | 928 | 6.2462 |
31.9647 | 117.0 | 936 | 6.2711 |
31.7568 | 118.0 | 944 | 6.2816 |
31.6016 | 119.0 | 952 | 6.2748 |
31.4066 | 120.0 | 960 | 6.3229 |
31.3133 | 121.0 | 968 | 6.3207 |
31.1978 | 122.0 | 976 | 6.3222 |
30.86 | 123.0 | 984 | 6.3559 |
30.8673 | 124.0 | 992 | 6.3625 |
30.6247 | 125.0 | 1000 | 6.3782 |
30.5435 | 126.0 | 1008 | 6.3907 |
30.4076 | 127.0 | 1016 | 6.4128 |
30.2195 | 128.0 | 1024 | 6.4249 |
30.1669 | 129.0 | 1032 | 6.4416 |
29.9457 | 130.0 | 1040 | 6.4512 |
29.6963 | 131.0 | 1048 | 6.4563 |
29.6095 | 132.0 | 1056 | 6.4673 |
29.5896 | 133.0 | 1064 | 6.4934 |
29.3053 | 134.0 | 1072 | 6.5024 |
29.2496 | 135.0 | 1080 | 6.5245 |
29.0821 | 136.0 | 1088 | 6.5266 |
28.9603 | 137.0 | 1096 | 6.5416 |
28.6974 | 138.0 | 1104 | 6.5673 |
28.6211 | 139.0 | 1112 | 6.5702 |
28.455 | 140.0 | 1120 | 6.5887 |
28.3855 | 141.0 | 1128 | 6.6114 |
28.1946 | 142.0 | 1136 | 6.6128 |
27.9813 | 143.0 | 1144 | 6.6373 |
27.9906 | 144.0 | 1152 | 6.6382 |
27.8784 | 145.0 | 1160 | 6.6586 |
27.7172 | 146.0 | 1168 | 6.6575 |
27.5235 | 147.0 | 1176 | 6.6785 |
27.5081 | 148.0 | 1184 | 6.6878 |
27.3395 | 149.0 | 1192 | 6.6958 |
27.2361 | 150.0 | 1200 | 6.7060 |
26.948 | 151.0 | 1208 | 6.7346 |
26.9204 | 152.0 | 1216 | 6.7387 |
26.887 | 153.0 | 1224 | 6.7592 |
26.6024 | 154.0 | 1232 | 6.7647 |
26.5708 | 155.0 | 1240 | 6.7677 |
26.476 | 156.0 | 1248 | 6.7781 |
26.4475 | 157.0 | 1256 | 6.7916 |
26.2854 | 158.0 | 1264 | 6.8156 |
26.188 | 159.0 | 1272 | 6.8198 |
26.0161 | 160.0 | 1280 | 6.8430 |
26.024 | 161.0 | 1288 | 6.8527 |
25.7818 | 162.0 | 1296 | 6.8659 |
25.7232 | 163.0 | 1304 | 6.8707 |
25.5568 | 164.0 | 1312 | 6.8754 |
25.3005 | 165.0 | 1320 | 6.9005 |
25.2464 | 166.0 | 1328 | 6.9110 |
25.3019 | 167.0 | 1336 | 6.9115 |
25.0743 | 168.0 | 1344 | 6.9386 |
25.0131 | 169.0 | 1352 | 6.9445 |
24.9058 | 170.0 | 1360 | 6.9428 |
24.8692 | 171.0 | 1368 | 6.9654 |
24.6763 | 172.0 | 1376 | 6.9659 |
24.6819 | 173.0 | 1384 | 6.9842 |
24.4872 | 174.0 | 1392 | 6.9843 |
24.4557 | 175.0 | 1400 | 7.0022 |
24.3423 | 176.0 | 1408 | 7.0041 |
24.1534 | 177.0 | 1416 | 7.0141 |
24.0958 | 178.0 | 1424 | 7.0263 |
24.1577 | 179.0 | 1432 | 7.0363 |
23.8901 | 180.0 | 1440 | 7.0587 |
23.8961 | 181.0 | 1448 | 7.0664 |
23.7979 | 182.0 | 1456 | 7.0685 |
23.6696 | 183.0 | 1464 | 7.0815 |
23.4375 | 184.0 | 1472 | 7.0878 |
23.4381 | 185.0 | 1480 | 7.1067 |
23.3986 | 186.0 | 1488 | 7.1106 |
23.3941 | 187.0 | 1496 | 7.1363 |
23.2634 | 188.0 | 1504 | 7.1369 |
23.1304 | 189.0 | 1512 | 7.1397 |
23.0716 | 190.0 | 1520 | 7.1489 |
22.9345 | 191.0 | 1528 | 7.1453 |
22.9294 | 192.0 | 1536 | 7.1651 |
22.8221 | 193.0 | 1544 | 7.1774 |
22.7517 | 194.0 | 1552 | 7.1864 |
22.5896 | 195.0 | 1560 | 7.1922 |
22.6763 | 196.0 | 1568 | 7.2035 |
22.543 | 197.0 | 1576 | 7.2001 |
22.3312 | 198.0 | 1584 | 7.2166 |
22.2966 | 199.0 | 1592 | 7.2292 |
22.2998 | 200.0 | 1600 | 7.2368 |
22.2441 | 201.0 | 1608 | 7.2382 |
22.1787 | 202.0 | 1616 | 7.2474 |
22.036 | 203.0 | 1624 | 7.2559 |
22.0256 | 204.0 | 1632 | 7.2747 |
21.8264 | 205.0 | 1640 | 7.2887 |
21.847 | 206.0 | 1648 | 7.2940 |
21.812 | 207.0 | 1656 | 7.3009 |
21.7328 | 208.0 | 1664 | 7.3048 |
21.6812 | 209.0 | 1672 | 7.3017 |
21.512 | 210.0 | 1680 | 7.3180 |
21.5199 | 211.0 | 1688 | 7.3315 |
21.3427 | 212.0 | 1696 | 7.3404 |
21.3916 | 213.0 | 1704 | 7.3363 |
21.3311 | 214.0 | 1712 | 7.3467 |
21.2813 | 215.0 | 1720 | 7.3588 |
21.1493 | 216.0 | 1728 | 7.3633 |
21.0464 | 217.0 | 1736 | 7.3743 |
21.0938 | 218.0 | 1744 | 7.3795 |
21.0753 | 219.0 | 1752 | 7.3858 |
20.8909 | 220.0 | 1760 | 7.3916 |
20.8487 | 221.0 | 1768 | 7.3935 |
20.739 | 222.0 | 1776 | 7.4054 |
20.7887 | 223.0 | 1784 | 7.4154 |
20.6405 | 224.0 | 1792 | 7.4227 |
20.6447 | 225.0 | 1800 | 7.4266 |
20.4904 | 226.0 | 1808 | 7.4339 |
20.5563 | 227.0 | 1816 | 7.4355 |
20.4321 | 228.0 | 1824 | 7.4435 |
20.5369 | 229.0 | 1832 | 7.4500 |
20.3137 | 230.0 | 1840 | 7.4613 |
20.332 | 231.0 | 1848 | 7.4682 |
20.2773 | 232.0 | 1856 | 7.4766 |
20.1258 | 233.0 | 1864 | 7.4802 |
20.1112 | 234.0 | 1872 | 7.4889 |
20.0866 | 235.0 | 1880 | 7.4954 |
19.9962 | 236.0 | 1888 | 7.4912 |
19.9217 | 237.0 | 1896 | 7.4984 |
19.9182 | 238.0 | 1904 | 7.5087 |
19.8691 | 239.0 | 1912 | 7.5181 |
19.8146 | 240.0 | 1920 | 7.5179 |
19.7451 | 241.0 | 1928 | 7.5246 |
19.8442 | 242.0 | 1936 | 7.5331 |
19.7008 | 243.0 | 1944 | 7.5308 |
19.6039 | 244.0 | 1952 | 7.5359 |
19.5752 | 245.0 | 1960 | 7.5486 |
19.603 | 246.0 | 1968 | 7.5452 |
19.526 | 247.0 | 1976 | 7.5561 |
19.4567 | 248.0 | 1984 | 7.5610 |
19.4087 | 249.0 | 1992 | 7.5669 |
19.4502 | 250.0 | 2000 | 7.5744 |
19.3038 | 251.0 | 2008 | 7.5777 |
19.2949 | 252.0 | 2016 | 7.5868 |
19.1793 | 253.0 | 2024 | 7.5907 |
19.3054 | 254.0 | 2032 | 7.5956 |
19.2497 | 255.0 | 2040 | 7.5922 |
19.1428 | 256.0 | 2048 | 7.6026 |
19.1033 | 257.0 | 2056 | 7.5987 |
19.04 | 258.0 | 2064 | 7.6085 |
19.1332 | 259.0 | 2072 | 7.6133 |
19.1033 | 260.0 | 2080 | 7.6134 |
18.9681 | 261.0 | 2088 | 7.6296 |
18.9423 | 262.0 | 2096 | 7.6220 |
18.8202 | 263.0 | 2104 | 7.6296 |
18.9517 | 264.0 | 2112 | 7.6344 |
18.9199 | 265.0 | 2120 | 7.6375 |
18.7935 | 266.0 | 2128 | 7.6411 |
18.7623 | 267.0 | 2136 | 7.6387 |
18.7257 | 268.0 | 2144 | 7.6487 |
18.6256 | 269.0 | 2152 | 7.6554 |
18.6174 | 270.0 | 2160 | 7.6597 |
18.6969 | 271.0 | 2168 | 7.6578 |
18.6943 | 272.0 | 2176 | 7.6664 |
18.4693 | 273.0 | 2184 | 7.6733 |
18.5465 | 274.0 | 2192 | 7.6674 |
18.5389 | 275.0 | 2200 | 7.6751 |
18.4663 | 276.0 | 2208 | 7.6773 |
18.4731 | 277.0 | 2216 | 7.6769 |
18.353 | 278.0 | 2224 | 7.6844 |
18.3503 | 279.0 | 2232 | 7.6879 |
18.4071 | 280.0 | 2240 | 7.6856 |
18.4131 | 281.0 | 2248 | 7.6884 |
18.3167 | 282.0 | 2256 | 7.6958 |
18.3702 | 283.0 | 2264 | 7.6996 |
18.2415 | 284.0 | 2272 | 7.7069 |
18.2712 | 285.0 | 2280 | 7.7060 |
18.2295 | 286.0 | 2288 | 7.7062 |
18.2393 | 287.0 | 2296 | 7.7129 |
18.1915 | 288.0 | 2304 | 7.7102 |
18.1751 | 289.0 | 2312 | 7.7178 |
18.1592 | 290.0 | 2320 | 7.7193 |
18.0663 | 291.0 | 2328 | 7.7231 |
18.0893 | 292.0 | 2336 | 7.7209 |
18.0764 | 293.0 | 2344 | 7.7281 |
18.0296 | 294.0 | 2352 | 7.7293 |
18.0648 | 295.0 | 2360 | 7.7317 |
17.98 | 296.0 | 2368 | 7.7354 |
17.9572 | 297.0 | 2376 | 7.7357 |
18.0146 | 298.0 | 2384 | 7.7367 |
18.0386 | 299.0 | 2392 | 7.7416 |
17.9969 | 300.0 | 2400 | 7.7399 |
17.9766 | 301.0 | 2408 | 7.7446 |
18.0088 | 302.0 | 2416 | 7.7450 |
17.9913 | 303.0 | 2424 | 7.7492 |
17.9319 | 304.0 | 2432 | 7.7475 |
17.8594 | 305.0 | 2440 | 7.7524 |
17.9173 | 306.0 | 2448 | 7.7529 |
17.8649 | 307.0 | 2456 | 7.7508 |
17.8351 | 308.0 | 2464 | 7.7565 |
17.8638 | 309.0 | 2472 | 7.7632 |
17.8086 | 310.0 | 2480 | 7.7615 |
17.7833 | 311.0 | 2488 | 7.7627 |
17.7528 | 312.0 | 2496 | 7.7661 |
17.7985 | 313.0 | 2504 | 7.7679 |
17.643 | 314.0 | 2512 | 7.7694 |
17.6903 | 315.0 | 2520 | 7.7696 |
17.6834 | 316.0 | 2528 | 7.7720 |
17.7201 | 317.0 | 2536 | 7.7711 |
17.7552 | 318.0 | 2544 | 7.7751 |
17.6127 | 319.0 | 2552 | 7.7723 |
17.6698 | 320.0 | 2560 | 7.7758 |
17.6404 | 321.0 | 2568 | 7.7786 |
17.6297 | 322.0 | 2576 | 7.7802 |
17.5499 | 323.0 | 2584 | 7.7829 |
17.6372 | 324.0 | 2592 | 7.7820 |
17.5715 | 325.0 | 2600 | 7.7868 |
17.6256 | 326.0 | 2608 | 7.7871 |
17.5635 | 327.0 | 2616 | 7.7849 |
17.5831 | 328.0 | 2624 | 7.7874 |
17.6796 | 329.0 | 2632 | 7.7896 |
17.5551 | 330.0 | 2640 | 7.7881 |
17.5849 | 331.0 | 2648 | 7.7889 |
17.5815 | 332.0 | 2656 | 7.7904 |
17.5988 | 333.0 | 2664 | 7.7896 |
17.4424 | 334.0 | 2672 | 7.7922 |
17.4758 | 335.0 | 2680 | 7.7942 |
17.5238 | 336.0 | 2688 | 7.7924 |
17.5397 | 337.0 | 2696 | 7.7945 |
17.5032 | 338.0 | 2704 | 7.7939 |
17.5263 | 339.0 | 2712 | 7.7973 |
17.4627 | 340.0 | 2720 | 7.7952 |
17.471 | 341.0 | 2728 | 7.7981 |
17.5093 | 342.0 | 2736 | 7.8003 |
17.4304 | 343.0 | 2744 | 7.7967 |
17.4437 | 344.0 | 2752 | 7.7986 |
17.4244 | 345.0 | 2760 | 7.7994 |
17.447 | 346.0 | 2768 | 7.8000 |
17.5448 | 347.0 | 2776 | 7.8001 |
17.5153 | 348.0 | 2784 | 7.8014 |
17.5372 | 349.0 | 2792 | 7.8007 |
17.4114 | 350.0 | 2800 | 7.8015 |
17.3633 | 351.0 | 2808 | 7.8009 |
17.4397 | 352.0 | 2816 | 7.8017 |
17.387 | 353.0 | 2824 | 7.8022 |
17.5439 | 354.0 | 2832 | 7.8022 |
17.3737 | 355.0 | 2840 | 7.8020 |
17.4433 | 356.0 | 2848 | 7.8020 |
17.4149 | 357.0 | 2856 | 7.8023 |
17.4908 | 358.0 | 2864 | 7.8023 |
17.4608 | 359.0 | 2872 | 7.8033 |
17.3332 | 360.0 | 2880 | 7.8033 |
17.4148 | 361.0 | 2888 | 7.8037 |
17.3321 | 362.0 | 2896 | 7.8039 |
17.4242 | 363.0 | 2904 | 7.8036 |
17.4116 | 364.0 | 2912 | 7.8034 |
17.4484 | 365.0 | 2920 | 7.8036 |
17.4695 | 366.0 | 2928 | 7.8038 |
17.4316 | 367.0 | 2936 | 7.8039 |
17.4131 | 368.0 | 2944 | 7.8040 |
17.3338 | 369.0 | 2952 | 7.8040 |
17.3707 | 370.0 | 2960 | 7.8039 |
17.3902 | 371.0 | 2968 | 7.8039 |
17.3523 | 372.0 | 2976 | 7.8039 |
17.4936 | 373.0 | 2984 | 7.8039 |
17.4325 | 374.0 | 2992 | 7.8039 |
17.4207 | 375.0 | 3000 | 7.8039 |
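
The reported evaluation loss (7.8039) is the value at the final step, not the lowest value in the table. The snippet below is a small sketch for locating the row with the lowest validation loss in a table of this shape. The helper names are hypothetical, and `EXCERPT` holds only a few rows copied from the table above; paste the full table to check every row.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_rows(table_text: str) -> List[Row]:
    """Parse pipe-delimited rows, skipping the header and separator lines."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        try:
            train_loss, epoch, step, val_loss = (float(c) for c in cells)
        except ValueError:
            continue  # non-numeric line or wrong number of cells
        rows.append(Row(train_loss, epoch, int(step), val_loss))
    return rows


def best_row(rows: List[Row]) -> Row:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


# A few rows copied verbatim from the table above.
EXCERPT = """
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
82.9763 | 1.0 | 8 | 10.1322 |
43.9877 | 50.0 | 400 | 5.7172 |
42.167 | 59.0 | 472 | 5.6686 |
41.817 | 62.0 | 496 | 5.6665 |
41.5342 | 63.0 | 504 | 5.6705 |
34.704 | 100.0 | 800 | 6.0233 |
22.2998 | 200.0 | 1600 | 7.2368 |
17.4207 | 375.0 | 3000 | 7.8039 |
"""

if __name__ == "__main__":
    rows = parse_rows(EXCERPT)
    best = best_row(rows)
    last = rows[-1]
    print(f"best: step {best.step}, validation loss {best.val_loss}")
    print(f"final: step {last.step}, validation loss {last.val_loss}")
```

In the full table the lowest validation loss is 5.6665 at step 496 (epoch 62). After that point validation loss rises while training loss keeps falling, which is consistent with overfitting, so an earlier checkpoint may be preferable if one was saved.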
Framework versions
- Transformers 4.49.0
- Pytorch 2.4.0+cu121
- Datasets 3.4.0
- Tokenizers 0.21.0