impossible-llms-spanish-natural
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 6.8184
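For readers who prefer perplexity, the reported loss can be converted with a single exponential. This is a minimal sketch that assumes the loss is a mean per-token cross-entropy in nats; the card does not confirm that, and the label smoothing listed under the training hyperparameters may inflate the value, so treat the result as a rough indication only.

```python
import math

# Evaluation loss copied from the card.
eval_loss = 6.8184

# Assumption: eval_loss is a mean per-token cross-entropy in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)

print(f"approximate perplexity: {perplexity:.1f}")
```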
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 12
- eval_batch_size: 8
- seed: 0
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 384
- total_eval_batch_size: 32
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 3000
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
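The derived values in the list above follow from the per-device settings, and the sketch below reproduces that arithmetic. The `lr_at` helper is hypothetical: it assumes the cosine scheduler uses a linear warmup followed by a single half-cosine decay to zero, which is the common form but is not spelled out in the card.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 12   # per device
EVAL_BATCH_SIZE = 8     # per device
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 8
TRAINING_STEPS = 3000
WARMUP_RATIO = 0.1

# Effective batch sizes: gradient accumulation applies to training only.
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES

# Number of optimizer steps spent in warmup.
warmup_steps = round(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Hypothetical helper: assumed linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    progress = (step - warmup_steps) / (TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size, warmup_steps)
for step in (0, 150, 300, 1650, 3000):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 0.0001 once warmup ends and decays to zero at the last training step.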
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
51.7482 | 1.0 | 8 | 10.1009 |
47.1136 | 2.0 | 16 | 9.3242 |
45.0853 | 3.0 | 24 | 8.9890 |
44.3998 | 4.0 | 32 | 8.8363 |
43.4616 | 5.0 | 40 | 8.6399 |
42.535 | 6.0 | 48 | 8.4361 |
41.3844 | 7.0 | 56 | 8.2123 |
40.0352 | 8.0 | 64 | 7.9688 |
39.0033 | 9.0 | 72 | 7.7402 |
37.8514 | 10.0 | 80 | 7.5173 |
36.5612 | 11.0 | 88 | 7.2928 |
35.4609 | 12.0 | 96 | 7.0676 |
34.5723 | 13.0 | 104 | 6.8513 |
33.2546 | 14.0 | 112 | 6.6514 |
32.4976 | 15.0 | 120 | 6.4657 |
31.7763 | 16.0 | 128 | 6.3143 |
31.025 | 17.0 | 136 | 6.1922 |
30.5242 | 18.0 | 144 | 6.1013 |
30.0561 | 19.0 | 152 | 6.0031 |
29.5901 | 20.0 | 160 | 5.9325 |
29.4545 | 21.0 | 168 | 5.8617 |
28.9811 | 22.0 | 176 | 5.8039 |
28.8831 | 23.0 | 184 | 5.7678 |
28.4813 | 24.0 | 192 | 5.7255 |
28.2451 | 25.0 | 200 | 5.6767 |
28.1487 | 26.0 | 208 | 5.6355 |
27.8517 | 27.0 | 216 | 5.6016 |
27.7778 | 28.0 | 224 | 5.5588 |
27.5518 | 29.0 | 232 | 5.5329 |
27.4451 | 30.0 | 240 | 5.5022 |
27.3166 | 31.0 | 248 | 5.4869 |
27.0987 | 32.0 | 256 | 5.4511 |
26.7841 | 33.0 | 264 | 5.4280 |
26.8971 | 34.0 | 272 | 5.4120 |
26.3692 | 35.0 | 280 | 5.3898 |
26.4659 | 36.0 | 288 | 5.3737 |
26.2954 | 37.0 | 296 | 5.3614 |
26.2919 | 38.0 | 304 | 5.3355 |
26.1314 | 39.0 | 312 | 5.3200 |
25.9263 | 40.0 | 320 | 5.3024 |
25.935 | 41.0 | 328 | 5.2871 |
25.6915 | 42.0 | 336 | 5.2826 |
25.4353 | 43.0 | 344 | 5.2606 |
25.5488 | 44.0 | 352 | 5.2499 |
25.4338 | 45.0 | 360 | 5.2350 |
25.2367 | 46.0 | 368 | 5.2183 |
25.2789 | 47.0 | 376 | 5.2069 |
24.9316 | 48.0 | 384 | 5.1947 |
24.6801 | 49.0 | 392 | 5.1896 |
24.8224 | 50.0 | 400 | 5.1728 |
24.6572 | 51.0 | 408 | 5.1657 |
24.3244 | 52.0 | 416 | 5.1515 |
24.412 | 53.0 | 424 | 5.1430 |
24.1812 | 54.0 | 432 | 5.1323 |
23.9904 | 55.0 | 440 | 5.1222 |
24.0799 | 56.0 | 448 | 5.1179 |
23.9543 | 57.0 | 456 | 5.1072 |
23.7952 | 58.0 | 464 | 5.0991 |
23.5379 | 59.0 | 472 | 5.0910 |
23.4634 | 60.0 | 480 | 5.0932 |
23.335 | 61.0 | 488 | 5.0834 |
23.2467 | 62.0 | 496 | 5.0836 |
23.1341 | 63.0 | 504 | 5.0820 |
23.1521 | 64.0 | 512 | 5.0685 |
22.7104 | 65.0 | 520 | 5.0679 |
22.6608 | 66.0 | 528 | 5.0670 |
22.6397 | 67.0 | 536 | 5.0661 |
22.422 | 68.0 | 544 | 5.0739 |
22.4531 | 69.0 | 552 | 5.0714 |
22.2173 | 70.0 | 560 | 5.0637 |
22.2348 | 71.0 | 568 | 5.0732 |
22.0378 | 72.0 | 576 | 5.0730 |
22.0102 | 73.0 | 584 | 5.0752 |
21.7305 | 74.0 | 592 | 5.0790 |
21.702 | 75.0 | 600 | 5.0820 |
21.7926 | 76.0 | 608 | 5.0809 |
21.554 | 77.0 | 616 | 5.0860 |
21.5144 | 78.0 | 624 | 5.0952 |
21.1927 | 79.0 | 632 | 5.1032 |
21.0842 | 80.0 | 640 | 5.1086 |
21.088 | 81.0 | 648 | 5.1133 |
20.8599 | 82.0 | 656 | 5.1215 |
20.7647 | 83.0 | 664 | 5.1244 |
20.7258 | 84.0 | 672 | 5.1338 |
20.6174 | 85.0 | 680 | 5.1403 |
20.4121 | 86.0 | 688 | 5.1380 |
20.3097 | 87.0 | 696 | 5.1572 |
20.2016 | 88.0 | 704 | 5.1605 |
20.0856 | 89.0 | 712 | 5.1729 |
20.0558 | 90.0 | 720 | 5.1826 |
19.8661 | 91.0 | 728 | 5.2008 |
19.6604 | 92.0 | 736 | 5.1983 |
19.7707 | 93.0 | 744 | 5.2132 |
19.5645 | 94.0 | 752 | 5.2248 |
19.563 | 95.0 | 760 | 5.2305 |
19.356 | 96.0 | 768 | 5.2506 |
19.3115 | 97.0 | 776 | 5.2607 |
19.317 | 98.0 | 784 | 5.2729 |
18.9184 | 99.0 | 792 | 5.2840 |
18.9451 | 100.0 | 800 | 5.2899 |
18.9796 | 101.0 | 808 | 5.2987 |
18.8531 | 102.0 | 816 | 5.3146 |
18.5326 | 103.0 | 824 | 5.3257 |
18.6404 | 104.0 | 832 | 5.3395 |
18.5631 | 105.0 | 840 | 5.3454 |
18.375 | 106.0 | 848 | 5.3609 |
18.3254 | 107.0 | 856 | 5.3662 |
18.2828 | 108.0 | 864 | 5.3731 |
17.9908 | 109.0 | 872 | 5.3853 |
17.8628 | 110.0 | 880 | 5.4045 |
17.8125 | 111.0 | 888 | 5.4169 |
17.7775 | 112.0 | 896 | 5.4288 |
17.6947 | 113.0 | 904 | 5.4396 |
17.5524 | 114.0 | 912 | 5.4574 |
17.6086 | 115.0 | 920 | 5.4649 |
17.5326 | 116.0 | 928 | 5.4638 |
17.3661 | 117.0 | 936 | 5.4946 |
17.2628 | 118.0 | 944 | 5.5055 |
17.1431 | 119.0 | 952 | 5.5055 |
17.1413 | 120.0 | 960 | 5.5349 |
17.0927 | 121.0 | 968 | 5.5345 |
16.8162 | 122.0 | 976 | 5.5575 |
16.6914 | 123.0 | 984 | 5.5597 |
16.6333 | 124.0 | 992 | 5.5778 |
16.6871 | 125.0 | 1000 | 5.5835 |
16.5154 | 126.0 | 1008 | 5.5979 |
16.5725 | 127.0 | 1016 | 5.6137 |
16.3883 | 128.0 | 1024 | 5.6261 |
16.3495 | 129.0 | 1032 | 5.6381 |
16.1944 | 130.0 | 1040 | 5.6494 |
16.1893 | 131.0 | 1048 | 5.6676 |
16.1091 | 132.0 | 1056 | 5.6876 |
15.9974 | 133.0 | 1064 | 5.6895 |
15.9566 | 134.0 | 1072 | 5.6994 |
15.7183 | 135.0 | 1080 | 5.7051 |
15.7345 | 136.0 | 1088 | 5.7222 |
15.7121 | 137.0 | 1096 | 5.7261 |
15.6825 | 138.0 | 1104 | 5.7487 |
15.5537 | 139.0 | 1112 | 5.7612 |
15.4367 | 140.0 | 1120 | 5.7742 |
15.467 | 141.0 | 1128 | 5.7805 |
15.3692 | 142.0 | 1136 | 5.7981 |
15.3071 | 143.0 | 1144 | 5.8083 |
15.2233 | 144.0 | 1152 | 5.8245 |
15.0736 | 145.0 | 1160 | 5.8318 |
15.0534 | 146.0 | 1168 | 5.8394 |
14.9743 | 147.0 | 1176 | 5.8561 |
14.8758 | 148.0 | 1184 | 5.8611 |
14.8195 | 149.0 | 1192 | 5.8811 |
14.7533 | 150.0 | 1200 | 5.8840 |
14.6149 | 151.0 | 1208 | 5.8999 |
14.6015 | 152.0 | 1216 | 5.9174 |
14.5694 | 153.0 | 1224 | 5.9121 |
14.4739 | 154.0 | 1232 | 5.9283 |
14.5006 | 155.0 | 1240 | 5.9400 |
14.3506 | 156.0 | 1248 | 5.9536 |
14.3135 | 157.0 | 1256 | 5.9643 |
14.2873 | 158.0 | 1264 | 5.9644 |
14.1698 | 159.0 | 1272 | 5.9845 |
14.159 | 160.0 | 1280 | 5.9881 |
14.0499 | 161.0 | 1288 | 6.0032 |
13.9969 | 162.0 | 1296 | 6.0161 |
13.9918 | 163.0 | 1304 | 6.0177 |
13.989 | 164.0 | 1312 | 6.0409 |
13.8361 | 165.0 | 1320 | 6.0419 |
13.8109 | 166.0 | 1328 | 6.0592 |
13.7056 | 167.0 | 1336 | 6.0604 |
13.6535 | 168.0 | 1344 | 6.0735 |
13.6727 | 169.0 | 1352 | 6.0859 |
13.4892 | 170.0 | 1360 | 6.0983 |
13.5187 | 171.0 | 1368 | 6.1006 |
13.416 | 172.0 | 1376 | 6.1236 |
13.4031 | 173.0 | 1384 | 6.1257 |
13.3737 | 174.0 | 1392 | 6.1196 |
13.3202 | 175.0 | 1400 | 6.1495 |
13.2018 | 176.0 | 1408 | 6.1524 |
13.1343 | 177.0 | 1416 | 6.1722 |
13.1208 | 178.0 | 1424 | 6.1700 |
13.068 | 179.0 | 1432 | 6.1792 |
12.9428 | 180.0 | 1440 | 6.1842 |
12.9316 | 181.0 | 1448 | 6.1961 |
12.966 | 182.0 | 1456 | 6.2035 |
12.9418 | 183.0 | 1464 | 6.2125 |
12.7829 | 184.0 | 1472 | 6.2231 |
12.796 | 185.0 | 1480 | 6.2370 |
12.75 | 186.0 | 1488 | 6.2376 |
12.6302 | 187.0 | 1496 | 6.2485 |
12.6604 | 188.0 | 1504 | 6.2562 |
12.593 | 189.0 | 1512 | 6.2658 |
12.6442 | 190.0 | 1520 | 6.2673 |
12.4919 | 191.0 | 1528 | 6.2805 |
12.4591 | 192.0 | 1536 | 6.2892 |
12.4038 | 193.0 | 1544 | 6.3006 |
12.3682 | 194.0 | 1552 | 6.3081 |
12.4278 | 195.0 | 1560 | 6.3119 |
12.3047 | 196.0 | 1568 | 6.3168 |
12.2657 | 197.0 | 1576 | 6.3308 |
12.1921 | 198.0 | 1584 | 6.3365 |
12.1392 | 199.0 | 1592 | 6.3429 |
12.1538 | 200.0 | 1600 | 6.3506 |
12.0806 | 201.0 | 1608 | 6.3562 |
12.0888 | 202.0 | 1616 | 6.3606 |
12.0709 | 203.0 | 1624 | 6.3710 |
11.8803 | 204.0 | 1632 | 6.3793 |
11.9575 | 205.0 | 1640 | 6.3862 |
11.9468 | 206.0 | 1648 | 6.3876 |
11.9192 | 207.0 | 1656 | 6.3980 |
11.8949 | 208.0 | 1664 | 6.4026 |
11.8611 | 209.0 | 1672 | 6.4132 |
11.8237 | 210.0 | 1680 | 6.4116 |
11.7115 | 211.0 | 1688 | 6.4298 |
11.7199 | 212.0 | 1696 | 6.4204 |
11.6559 | 213.0 | 1704 | 6.4333 |
11.6339 | 214.0 | 1712 | 6.4446 |
11.6064 | 215.0 | 1720 | 6.4457 |
11.5941 | 216.0 | 1728 | 6.4505 |
11.5943 | 217.0 | 1736 | 6.4660 |
11.5027 | 218.0 | 1744 | 6.4639 |
11.5226 | 219.0 | 1752 | 6.4743 |
11.4428 | 220.0 | 1760 | 6.4731 |
11.3947 | 221.0 | 1768 | 6.4823 |
11.4507 | 222.0 | 1776 | 6.4905 |
11.3622 | 223.0 | 1784 | 6.4986 |
11.3231 | 224.0 | 1792 | 6.5073 |
11.2868 | 225.0 | 1800 | 6.5093 |
11.2595 | 226.0 | 1808 | 6.5114 |
11.2187 | 227.0 | 1816 | 6.5212 |
11.2437 | 228.0 | 1824 | 6.5231 |
11.2354 | 229.0 | 1832 | 6.5265 |
11.2185 | 230.0 | 1840 | 6.5331 |
11.1903 | 231.0 | 1848 | 6.5374 |
11.1166 | 232.0 | 1856 | 6.5424 |
11.1199 | 233.0 | 1864 | 6.5454 |
11.092 | 234.0 | 1872 | 6.5530 |
11.1067 | 235.0 | 1880 | 6.5634 |
11.0527 | 236.0 | 1888 | 6.5692 |
11.0075 | 237.0 | 1896 | 6.5730 |
10.9491 | 238.0 | 1904 | 6.5732 |
10.9812 | 239.0 | 1912 | 6.5781 |
10.8566 | 240.0 | 1920 | 6.5769 |
10.9052 | 241.0 | 1928 | 6.5863 |
10.9273 | 242.0 | 1936 | 6.5914 |
10.8682 | 243.0 | 1944 | 6.5963 |
10.8243 | 244.0 | 1952 | 6.5998 |
10.8447 | 245.0 | 1960 | 6.6061 |
10.8102 | 246.0 | 1968 | 6.6071 |
10.7664 | 247.0 | 1976 | 6.6137 |
10.8312 | 248.0 | 1984 | 6.6259 |
10.6884 | 249.0 | 1992 | 6.6233 |
10.7115 | 250.0 | 2000 | 6.6312 |
10.7217 | 251.0 | 2008 | 6.6256 |
10.6847 | 252.0 | 2016 | 6.6327 |
10.6836 | 253.0 | 2024 | 6.6431 |
10.637 | 254.0 | 2032 | 6.6433 |
10.5986 | 255.0 | 2040 | 6.6483 |
10.5909 | 256.0 | 2048 | 6.6469 |
10.5634 | 257.0 | 2056 | 6.6528 |
10.4946 | 258.0 | 2064 | 6.6533 |
10.5208 | 259.0 | 2072 | 6.6567 |
10.4965 | 260.0 | 2080 | 6.6610 |
10.4934 | 261.0 | 2088 | 6.6725 |
10.4299 | 262.0 | 2096 | 6.6749 |
10.4986 | 263.0 | 2104 | 6.6818 |
10.4502 | 264.0 | 2112 | 6.6766 |
10.4203 | 265.0 | 2120 | 6.6822 |
10.4052 | 266.0 | 2128 | 6.6795 |
10.4339 | 267.0 | 2136 | 6.6875 |
10.3813 | 268.0 | 2144 | 6.6916 |
10.3406 | 269.0 | 2152 | 6.6874 |
10.3575 | 270.0 | 2160 | 6.6944 |
10.3259 | 271.0 | 2168 | 6.7024 |
10.3898 | 272.0 | 2176 | 6.7016 |
10.2788 | 273.0 | 2184 | 6.7059 |
10.2602 | 274.0 | 2192 | 6.7072 |
10.2348 | 275.0 | 2200 | 6.7081 |
10.2383 | 276.0 | 2208 | 6.7134 |
10.316 | 277.0 | 2216 | 6.7189 |
10.2411 | 278.0 | 2224 | 6.7181 |
10.2012 | 279.0 | 2232 | 6.7194 |
10.1992 | 280.0 | 2240 | 6.7229 |
10.1965 | 281.0 | 2248 | 6.7239 |
10.2203 | 282.0 | 2256 | 6.7193 |
10.1755 | 283.0 | 2264 | 6.7321 |
10.1564 | 284.0 | 2272 | 6.7296 |
10.1029 | 285.0 | 2280 | 6.7375 |
10.1323 | 286.0 | 2288 | 6.7398 |
10.0811 | 287.0 | 2296 | 6.7425 |
10.107 | 288.0 | 2304 | 6.7461 |
10.1342 | 289.0 | 2312 | 6.7479 |
10.1482 | 290.0 | 2320 | 6.7445 |
10.0679 | 291.0 | 2328 | 6.7457 |
10.0919 | 292.0 | 2336 | 6.7526 |
10.0352 | 293.0 | 2344 | 6.7521 |
10.0986 | 294.0 | 2352 | 6.7522 |
10.0199 | 295.0 | 2360 | 6.7572 |
9.9978 | 296.0 | 2368 | 6.7582 |
10.0085 | 297.0 | 2376 | 6.7607 |
9.9583 | 298.0 | 2384 | 6.7653 |
10.058 | 299.0 | 2392 | 6.7660 |
10.006 | 300.0 | 2400 | 6.7694 |
9.9744 | 301.0 | 2408 | 6.7687 |
9.9453 | 302.0 | 2416 | 6.7729 |
9.9084 | 303.0 | 2424 | 6.7744 |
9.9569 | 304.0 | 2432 | 6.7738 |
9.9564 | 305.0 | 2440 | 6.7771 |
9.9217 | 306.0 | 2448 | 6.7791 |
9.9595 | 307.0 | 2456 | 6.7833 |
9.9157 | 308.0 | 2464 | 6.7840 |
9.9505 | 309.0 | 2472 | 6.7788 |
9.8902 | 310.0 | 2480 | 6.7804 |
9.913 | 311.0 | 2488 | 6.7810 |
9.8953 | 312.0 | 2496 | 6.7854 |
9.8388 | 313.0 | 2504 | 6.7866 |
9.8951 | 314.0 | 2512 | 6.7877 |
9.8956 | 315.0 | 2520 | 6.7907 |
9.8355 | 316.0 | 2528 | 6.7902 |
9.9097 | 317.0 | 2536 | 6.7918 |
9.8304 | 318.0 | 2544 | 6.7911 |
9.8385 | 319.0 | 2552 | 6.7912 |
9.8874 | 320.0 | 2560 | 6.7946 |
9.8086 | 321.0 | 2568 | 6.7969 |
9.8169 | 322.0 | 2576 | 6.7963 |
9.7993 | 323.0 | 2584 | 6.7970 |
9.8324 | 324.0 | 2592 | 6.7978 |
9.8391 | 325.0 | 2600 | 6.7984 |
9.8765 | 326.0 | 2608 | 6.8019 |
9.7892 | 327.0 | 2616 | 6.8017 |
9.7902 | 328.0 | 2624 | 6.8017 |
9.8015 | 329.0 | 2632 | 6.8038 |
9.8687 | 330.0 | 2640 | 6.8047 |
9.7813 | 331.0 | 2648 | 6.8049 |
9.7734 | 332.0 | 2656 | 6.8059 |
9.8321 | 333.0 | 2664 | 6.8087 |
9.7921 | 334.0 | 2672 | 6.8071 |
9.7955 | 335.0 | 2680 | 6.8085 |
9.7474 | 336.0 | 2688 | 6.8057 |
9.7734 | 337.0 | 2696 | 6.8117 |
9.7687 | 338.0 | 2704 | 6.8099 |
9.7499 | 339.0 | 2712 | 6.8132 |
9.7645 | 340.0 | 2720 | 6.8108 |
9.7615 | 341.0 | 2728 | 6.8119 |
9.7293 | 342.0 | 2736 | 6.8126 |
9.7555 | 343.0 | 2744 | 6.8127 |
9.7456 | 344.0 | 2752 | 6.8131 |
9.7821 | 345.0 | 2760 | 6.8145 |
9.7462 | 346.0 | 2768 | 6.8143 |
9.8005 | 347.0 | 2776 | 6.8151 |
9.7558 | 348.0 | 2784 | 6.8156 |
9.7434 | 349.0 | 2792 | 6.8151 |
9.7548 | 350.0 | 2800 | 6.8159 |
9.718 | 351.0 | 2808 | 6.8175 |
9.7192 | 352.0 | 2816 | 6.8159 |
9.7142 | 353.0 | 2824 | 6.8166 |
9.7395 | 354.0 | 2832 | 6.8176 |
9.7601 | 355.0 | 2840 | 6.8173 |
9.7876 | 356.0 | 2848 | 6.8175 |
9.7267 | 357.0 | 2856 | 6.8177 |
9.8034 | 358.0 | 2864 | 6.8171 |
9.7382 | 359.0 | 2872 | 6.8170 |
9.7534 | 360.0 | 2880 | 6.8178 |
9.7423 | 361.0 | 2888 | 6.8182 |
9.7001 | 362.0 | 2896 | 6.8180 |
9.704 | 363.0 | 2904 | 6.8181 |
9.7615 | 364.0 | 2912 | 6.8186 |
9.7132 | 365.0 | 2920 | 6.8186 |
9.6365 | 366.0 | 2928 | 6.8184 |
9.7542 | 367.0 | 2936 | 6.8183 |
9.7524 | 368.0 | 2944 | 6.8183 |
9.7454 | 369.0 | 2952 | 6.8183 |
9.7615 | 370.0 | 2960 | 6.8182 |
9.7779 | 371.0 | 2968 | 6.8183 |
9.7548 | 372.0 | 2976 | 6.8183 |
9.7373 | 373.0 | 2984 | 6.8183 |
9.7027 | 374.0 | 2992 | 6.8184 |
9.7623 | 375.0 | 3000 | 6.8184 |
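In the table above, validation loss reaches its lowest value at step 560 and rises afterwards while training loss keeps falling, which is the usual signature of overfitting. The sketch below shows one way to locate that point. It uses only a hand-picked excerpt of rows copied from the table, and it makes no claim about which checkpoint was actually saved, since the card does not say.

```python
# Excerpt of (epoch, step, validation_loss) rows copied from the table above.
ROWS = [
    (1, 8, 10.1009),
    (35, 280, 5.3898),
    (67, 536, 5.0661),
    (70, 560, 5.0637),
    (71, 568, 5.0732),
    (125, 1000, 5.5835),
    (250, 2000, 6.6312),
    (375, 3000, 6.8184),
]

# Row with the lowest validation loss in the excerpt.
best_epoch, best_step, best_loss = min(ROWS, key=lambda row: row[2])

# The last row corresponds to the end of training.
final_epoch, final_step, final_loss = ROWS[-1]

# How much validation loss degraded between the best point and the end.
gap = final_loss - best_loss

print(f"best: step {best_step} (epoch {best_epoch}), loss {best_loss}")
print(f"final: step {final_step} (epoch {final_epoch}), loss {final_loss}")
print(f"gap: {gap:.4f}")
```

To run the same check over the full table, the rows could be parsed by splitting each line on `|` instead of being typed in by hand.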
Framework versions
- Transformers 4.49.0
- Pytorch 2.4.0+cu121
- Datasets 3.4.0
- Tokenizers 0.21.0