impossible-llms-spanish-fronting-n
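This model is a fine-tuned version of a base model (its name is missing from this card) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 7.4765 (the validation loss at the final training step, 3000)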
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 12
- eval_batch_size: 8
- seed: 0
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 384
- total_eval_batch_size: 32
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 3000
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
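The derived entries in the list above (total_train_batch_size, total_eval_batch_size) follow from the per-device settings, and the warmup ratio implies a warmup length. The following is a minimal sketch in plain Python, not the training code itself. The helper name `cosine_lr` is hypothetical, and both the schedule formula (linear warmup followed by a half-cosine decay to zero) and the rounding of the warmup length are assumptions about how such a schedule is commonly computed, not details stated in this card.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 1e-4
train_batch_size = 12            # per device
eval_batch_size = 8              # per device
num_devices = 4
gradient_accumulation_steps = 8
training_steps = 3000
warmup_ratio = 0.1

# Effective batch sizes: per-device size x devices (x accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Warmup length implied by the ratio (assumption: rounded up to a whole step).
warmup_steps = math.ceil(training_steps * warmup_ratio)


def cosine_lr(step: int) -> float:
    """Assumed schedule: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size, warmup_steps)
for step in (0, 150, 300, 1650, 3000):
    print(step, f"{cosine_lr(step):.2e}")
```

The first printed line reproduces the 384 and 32 reported in the list, along with a 300-step warmup. Under the assumed formula, the learning rate peaks at 0.0001 at step 300 and decays to zero by step 3000.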
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
51.7784 | 1.0 | 8 | 10.1116 |
47.2634 | 2.0 | 16 | 9.3470 |
45.3511 | 3.0 | 24 | 9.0155 |
44.5456 | 4.0 | 32 | 8.8750 |
43.7733 | 5.0 | 40 | 8.7163 |
42.7853 | 6.0 | 48 | 8.5225 |
41.7995 | 7.0 | 56 | 8.3087 |
40.4943 | 8.0 | 64 | 8.0867 |
39.302 | 9.0 | 72 | 7.8698 |
38.4185 | 10.0 | 80 | 7.6418 |
37.3145 | 11.0 | 88 | 7.4187 |
36.1962 | 12.0 | 96 | 7.1951 |
34.9774 | 13.0 | 104 | 6.9837 |
34.0831 | 14.0 | 112 | 6.7900 |
33.1721 | 15.0 | 120 | 6.6131 |
32.4419 | 16.0 | 128 | 6.4728 |
31.9161 | 17.0 | 136 | 6.3644 |
31.2219 | 18.0 | 144 | 6.2884 |
31.1113 | 19.0 | 152 | 6.2115 |
30.7667 | 20.0 | 160 | 6.1593 |
30.5015 | 21.0 | 168 | 6.1089 |
30.3263 | 22.0 | 176 | 6.0646 |
29.9469 | 23.0 | 184 | 6.0196 |
29.8472 | 24.0 | 192 | 5.9851 |
29.634 | 25.0 | 200 | 5.9474 |
29.3636 | 26.0 | 208 | 5.9247 |
29.3971 | 27.0 | 216 | 5.8943 |
28.9826 | 28.0 | 224 | 5.8664 |
28.9979 | 29.0 | 232 | 5.8435 |
28.7772 | 30.0 | 240 | 5.8160 |
28.8633 | 31.0 | 248 | 5.8035 |
28.8801 | 32.0 | 256 | 5.7748 |
28.613 | 33.0 | 264 | 5.7570 |
28.377 | 34.0 | 272 | 5.7357 |
28.2545 | 35.0 | 280 | 5.7216 |
28.2638 | 36.0 | 288 | 5.7025 |
27.9661 | 37.0 | 296 | 5.6895 |
27.8934 | 38.0 | 304 | 5.6784 |
27.8273 | 39.0 | 312 | 5.6638 |
27.5359 | 40.0 | 320 | 5.6506 |
27.4462 | 41.0 | 328 | 5.6327 |
27.4709 | 42.0 | 336 | 5.6117 |
27.3402 | 43.0 | 344 | 5.6071 |
27.1623 | 44.0 | 352 | 5.5898 |
27.0289 | 45.0 | 360 | 5.5780 |
26.8429 | 46.0 | 368 | 5.5608 |
26.7129 | 47.0 | 376 | 5.5495 |
26.6637 | 48.0 | 384 | 5.5411 |
26.5017 | 49.0 | 392 | 5.5316 |
26.5155 | 50.0 | 400 | 5.5264 |
26.4028 | 51.0 | 408 | 5.5115 |
26.1552 | 52.0 | 416 | 5.5025 |
26.1335 | 53.0 | 424 | 5.5021 |
25.9309 | 54.0 | 432 | 5.4959 |
25.8646 | 55.0 | 440 | 5.4891 |
25.7497 | 56.0 | 448 | 5.4911 |
25.4582 | 57.0 | 456 | 5.4858 |
25.507 | 58.0 | 464 | 5.4752 |
25.3367 | 59.0 | 472 | 5.4773 |
25.353 | 60.0 | 480 | 5.4720 |
25.0761 | 61.0 | 488 | 5.4754 |
25.0215 | 62.0 | 496 | 5.4691 |
25.0473 | 63.0 | 504 | 5.4730 |
24.6567 | 64.0 | 512 | 5.4700 |
24.7463 | 65.0 | 520 | 5.4742 |
24.6361 | 66.0 | 528 | 5.4784 |
24.4217 | 67.0 | 536 | 5.4747 |
24.3268 | 68.0 | 544 | 5.4747 |
24.264 | 69.0 | 552 | 5.4828 |
24.284 | 70.0 | 560 | 5.4972 |
24.1288 | 71.0 | 568 | 5.4933 |
23.9271 | 72.0 | 576 | 5.4963 |
23.9558 | 73.0 | 584 | 5.5114 |
23.8464 | 74.0 | 592 | 5.5095 |
23.5567 | 75.0 | 600 | 5.5177 |
23.4508 | 76.0 | 608 | 5.5224 |
23.3597 | 77.0 | 616 | 5.5401 |
23.2514 | 78.0 | 624 | 5.5444 |
23.0297 | 79.0 | 632 | 5.5501 |
22.8961 | 80.0 | 640 | 5.5634 |
22.9509 | 81.0 | 648 | 5.5759 |
22.856 | 82.0 | 656 | 5.5821 |
22.5475 | 83.0 | 664 | 5.5873 |
22.3554 | 84.0 | 672 | 5.6013 |
22.4708 | 85.0 | 680 | 5.6089 |
22.2423 | 86.0 | 688 | 5.6206 |
22.1425 | 87.0 | 696 | 5.6361 |
22.1266 | 88.0 | 704 | 5.6449 |
21.9551 | 89.0 | 712 | 5.6537 |
21.7956 | 90.0 | 720 | 5.6684 |
21.7506 | 91.0 | 728 | 5.6829 |
21.766 | 92.0 | 736 | 5.6955 |
21.4058 | 93.0 | 744 | 5.7129 |
21.5818 | 94.0 | 752 | 5.7187 |
21.5058 | 95.0 | 760 | 5.7243 |
21.3181 | 96.0 | 768 | 5.7320 |
21.0303 | 97.0 | 776 | 5.7566 |
20.9767 | 98.0 | 784 | 5.7750 |
20.8306 | 99.0 | 792 | 5.7804 |
20.8078 | 100.0 | 800 | 5.7926 |
20.5814 | 101.0 | 808 | 5.8065 |
20.5153 | 102.0 | 816 | 5.8193 |
20.33 | 103.0 | 824 | 5.8343 |
20.304 | 104.0 | 832 | 5.8458 |
20.4121 | 105.0 | 840 | 5.8602 |
20.1227 | 106.0 | 848 | 5.8680 |
20.107 | 107.0 | 856 | 5.8870 |
20.0244 | 108.0 | 864 | 5.8983 |
19.8429 | 109.0 | 872 | 5.9154 |
19.6658 | 110.0 | 880 | 5.9283 |
19.7151 | 111.0 | 888 | 5.9440 |
19.52 | 112.0 | 896 | 5.9515 |
19.4396 | 113.0 | 904 | 5.9701 |
19.3786 | 114.0 | 912 | 5.9825 |
19.2078 | 115.0 | 920 | 6.0011 |
19.059 | 116.0 | 928 | 6.0191 |
19.0577 | 117.0 | 936 | 6.0321 |
18.9654 | 118.0 | 944 | 6.0395 |
18.8006 | 119.0 | 952 | 6.0530 |
18.7847 | 120.0 | 960 | 6.0753 |
18.6216 | 121.0 | 968 | 6.0882 |
18.6164 | 122.0 | 976 | 6.1006 |
18.4312 | 123.0 | 984 | 6.1104 |
18.3627 | 124.0 | 992 | 6.1188 |
18.323 | 125.0 | 1000 | 6.1404 |
18.1081 | 126.0 | 1008 | 6.1458 |
18.1846 | 127.0 | 1016 | 6.1653 |
17.9619 | 128.0 | 1024 | 6.1814 |
17.8304 | 129.0 | 1032 | 6.1954 |
17.9639 | 130.0 | 1040 | 6.2034 |
17.7216 | 131.0 | 1048 | 6.2219 |
17.5679 | 132.0 | 1056 | 6.2268 |
17.6338 | 133.0 | 1064 | 6.2405 |
17.4533 | 134.0 | 1072 | 6.2574 |
17.3785 | 135.0 | 1080 | 6.2745 |
17.336 | 136.0 | 1088 | 6.2777 |
17.239 | 137.0 | 1096 | 6.3045 |
17.2146 | 138.0 | 1104 | 6.3152 |
16.9797 | 139.0 | 1112 | 6.3267 |
16.8837 | 140.0 | 1120 | 6.3396 |
16.9108 | 141.0 | 1128 | 6.3553 |
16.7512 | 142.0 | 1136 | 6.3601 |
16.7044 | 143.0 | 1144 | 6.3728 |
16.672 | 144.0 | 1152 | 6.3835 |
16.5641 | 145.0 | 1160 | 6.4016 |
16.5522 | 146.0 | 1168 | 6.4245 |
16.4778 | 147.0 | 1176 | 6.4233 |
16.3825 | 148.0 | 1184 | 6.4333 |
16.3415 | 149.0 | 1192 | 6.4436 |
16.1825 | 150.0 | 1200 | 6.4635 |
16.1146 | 151.0 | 1208 | 6.4792 |
16.034 | 152.0 | 1216 | 6.4810 |
16.015 | 153.0 | 1224 | 6.5045 |
15.8937 | 154.0 | 1232 | 6.5265 |
15.8686 | 155.0 | 1240 | 6.5210 |
15.7685 | 156.0 | 1248 | 6.5367 |
15.7502 | 157.0 | 1256 | 6.5394 |
15.7103 | 158.0 | 1264 | 6.5634 |
15.5698 | 159.0 | 1272 | 6.5660 |
15.5699 | 160.0 | 1280 | 6.5810 |
15.4384 | 161.0 | 1288 | 6.5823 |
15.303 | 162.0 | 1296 | 6.5966 |
15.2721 | 163.0 | 1304 | 6.6100 |
15.2977 | 164.0 | 1312 | 6.6249 |
15.2382 | 165.0 | 1320 | 6.6279 |
15.0412 | 166.0 | 1328 | 6.6498 |
15.0258 | 167.0 | 1336 | 6.6603 |
14.9784 | 168.0 | 1344 | 6.6658 |
14.8854 | 169.0 | 1352 | 6.6881 |
14.8676 | 170.0 | 1360 | 6.6862 |
14.7526 | 171.0 | 1368 | 6.6944 |
14.7146 | 172.0 | 1376 | 6.7082 |
14.7218 | 173.0 | 1384 | 6.7177 |
14.6535 | 174.0 | 1392 | 6.7232 |
14.5767 | 175.0 | 1400 | 6.7316 |
14.4947 | 176.0 | 1408 | 6.7440 |
14.5144 | 177.0 | 1416 | 6.7538 |
14.4185 | 178.0 | 1424 | 6.7655 |
14.3 | 179.0 | 1432 | 6.7633 |
14.3174 | 180.0 | 1440 | 6.7894 |
14.191 | 181.0 | 1448 | 6.7982 |
14.143 | 182.0 | 1456 | 6.8087 |
14.0423 | 183.0 | 1464 | 6.8203 |
14.018 | 184.0 | 1472 | 6.8300 |
14.0147 | 185.0 | 1480 | 6.8352 |
13.8292 | 186.0 | 1488 | 6.8433 |
13.8796 | 187.0 | 1496 | 6.8489 |
13.838 | 188.0 | 1504 | 6.8627 |
13.7829 | 189.0 | 1512 | 6.8665 |
13.7829 | 190.0 | 1520 | 6.8765 |
13.6909 | 191.0 | 1528 | 6.8911 |
13.6776 | 192.0 | 1536 | 6.8976 |
13.5998 | 193.0 | 1544 | 6.9042 |
13.5913 | 194.0 | 1552 | 6.9140 |
13.3959 | 195.0 | 1560 | 6.9231 |
13.4454 | 196.0 | 1568 | 6.9237 |
13.4054 | 197.0 | 1576 | 6.9486 |
13.4084 | 198.0 | 1584 | 6.9489 |
13.2953 | 199.0 | 1592 | 6.9499 |
13.1916 | 200.0 | 1600 | 6.9641 |
13.1678 | 201.0 | 1608 | 6.9675 |
13.2026 | 202.0 | 1616 | 6.9851 |
13.1273 | 203.0 | 1624 | 6.9830 |
13.0846 | 204.0 | 1632 | 6.9894 |
13.0633 | 205.0 | 1640 | 7.0006 |
13.0592 | 206.0 | 1648 | 7.0150 |
12.9165 | 207.0 | 1656 | 7.0081 |
12.9007 | 208.0 | 1664 | 7.0187 |
12.8814 | 209.0 | 1672 | 7.0224 |
12.8298 | 210.0 | 1680 | 7.0354 |
12.8303 | 211.0 | 1688 | 7.0361 |
12.713 | 212.0 | 1696 | 7.0449 |
12.706 | 213.0 | 1704 | 7.0602 |
12.7231 | 214.0 | 1712 | 7.0602 |
12.6166 | 215.0 | 1720 | 7.0736 |
12.5342 | 216.0 | 1728 | 7.0808 |
12.534 | 217.0 | 1736 | 7.0942 |
12.5529 | 218.0 | 1744 | 7.0907 |
12.5035 | 219.0 | 1752 | 7.0909 |
12.4415 | 220.0 | 1760 | 7.1029 |
12.4305 | 221.0 | 1768 | 7.1177 |
12.3753 | 222.0 | 1776 | 7.1214 |
12.3459 | 223.0 | 1784 | 7.1190 |
12.4079 | 224.0 | 1792 | 7.1317 |
12.2867 | 225.0 | 1800 | 7.1421 |
12.2105 | 226.0 | 1808 | 7.1439 |
12.1905 | 227.0 | 1816 | 7.1524 |
12.2438 | 228.0 | 1824 | 7.1568 |
12.2483 | 229.0 | 1832 | 7.1593 |
12.1855 | 230.0 | 1840 | 7.1681 |
12.0293 | 231.0 | 1848 | 7.1758 |
12.0954 | 232.0 | 1856 | 7.1741 |
12.0445 | 233.0 | 1864 | 7.1902 |
11.9761 | 234.0 | 1872 | 7.1862 |
11.8848 | 235.0 | 1880 | 7.1927 |
11.9529 | 236.0 | 1888 | 7.1979 |
11.885 | 237.0 | 1896 | 7.2046 |
11.7895 | 238.0 | 1904 | 7.2161 |
11.7884 | 239.0 | 1912 | 7.2123 |
11.779 | 240.0 | 1920 | 7.2259 |
11.8171 | 241.0 | 1928 | 7.2221 |
11.7989 | 242.0 | 1936 | 7.2265 |
11.7214 | 243.0 | 1944 | 7.2291 |
11.7687 | 244.0 | 1952 | 7.2418 |
11.7037 | 245.0 | 1960 | 7.2363 |
11.6243 | 246.0 | 1968 | 7.2472 |
11.7001 | 247.0 | 1976 | 7.2602 |
11.6701 | 248.0 | 1984 | 7.2492 |
11.6172 | 249.0 | 1992 | 7.2614 |
11.6292 | 250.0 | 2000 | 7.2619 |
11.5318 | 251.0 | 2008 | 7.2705 |
11.5071 | 252.0 | 2016 | 7.2762 |
11.4715 | 253.0 | 2024 | 7.2762 |
11.4828 | 254.0 | 2032 | 7.2840 |
11.4232 | 255.0 | 2040 | 7.2845 |
11.4747 | 256.0 | 2048 | 7.2895 |
11.3745 | 257.0 | 2056 | 7.2940 |
11.3526 | 258.0 | 2064 | 7.2957 |
11.3843 | 259.0 | 2072 | 7.3001 |
11.3224 | 260.0 | 2080 | 7.3046 |
11.3156 | 261.0 | 2088 | 7.3082 |
11.3194 | 262.0 | 2096 | 7.3134 |
11.2449 | 263.0 | 2104 | 7.3191 |
11.2619 | 264.0 | 2112 | 7.3185 |
11.2628 | 265.0 | 2120 | 7.3283 |
11.2272 | 266.0 | 2128 | 7.3375 |
11.1958 | 267.0 | 2136 | 7.3299 |
11.1988 | 268.0 | 2144 | 7.3399 |
11.1489 | 269.0 | 2152 | 7.3462 |
11.1609 | 270.0 | 2160 | 7.3442 |
11.102 | 271.0 | 2168 | 7.3518 |
11.1255 | 272.0 | 2176 | 7.3539 |
11.0609 | 273.0 | 2184 | 7.3515 |
11.1389 | 274.0 | 2192 | 7.3631 |
11.0255 | 275.0 | 2200 | 7.3614 |
11.0338 | 276.0 | 2208 | 7.3632 |
10.9559 | 277.0 | 2216 | 7.3717 |
10.966 | 278.0 | 2224 | 7.3656 |
10.9771 | 279.0 | 2232 | 7.3719 |
10.9786 | 280.0 | 2240 | 7.3738 |
10.9463 | 281.0 | 2248 | 7.3744 |
10.8997 | 282.0 | 2256 | 7.3783 |
10.8937 | 283.0 | 2264 | 7.3798 |
10.8832 | 284.0 | 2272 | 7.3848 |
10.8433 | 285.0 | 2280 | 7.3872 |
10.9294 | 286.0 | 2288 | 7.3907 |
10.7831 | 287.0 | 2296 | 7.3915 |
10.8367 | 288.0 | 2304 | 7.3944 |
10.8091 | 289.0 | 2312 | 7.4008 |
10.8585 | 290.0 | 2320 | 7.3947 |
10.7879 | 291.0 | 2328 | 7.4077 |
10.767 | 292.0 | 2336 | 7.4011 |
10.8139 | 293.0 | 2344 | 7.4059 |
10.7572 | 294.0 | 2352 | 7.4066 |
10.7829 | 295.0 | 2360 | 7.4123 |
10.7285 | 296.0 | 2368 | 7.4130 |
10.7642 | 297.0 | 2376 | 7.4128 |
10.7138 | 298.0 | 2384 | 7.4188 |
10.7285 | 299.0 | 2392 | 7.4162 |
10.7366 | 300.0 | 2400 | 7.4211 |
10.7125 | 301.0 | 2408 | 7.4212 |
10.6999 | 302.0 | 2416 | 7.4237 |
10.6657 | 303.0 | 2424 | 7.4270 |
10.6579 | 304.0 | 2432 | 7.4259 |
10.7066 | 305.0 | 2440 | 7.4298 |
10.6463 | 306.0 | 2448 | 7.4331 |
10.683 | 307.0 | 2456 | 7.4295 |
10.6238 | 308.0 | 2464 | 7.4371 |
10.6731 | 309.0 | 2472 | 7.4378 |
10.6708 | 310.0 | 2480 | 7.4341 |
10.6065 | 311.0 | 2488 | 7.4432 |
10.5891 | 312.0 | 2496 | 7.4401 |
10.581 | 313.0 | 2504 | 7.4412 |
10.564 | 314.0 | 2512 | 7.4472 |
10.6173 | 315.0 | 2520 | 7.4462 |
10.6134 | 316.0 | 2528 | 7.4493 |
10.5954 | 317.0 | 2536 | 7.4517 |
10.5617 | 318.0 | 2544 | 7.4524 |
10.5486 | 319.0 | 2552 | 7.4545 |
10.5362 | 320.0 | 2560 | 7.4524 |
10.5081 | 321.0 | 2568 | 7.4523 |
10.5617 | 322.0 | 2576 | 7.4552 |
10.521 | 323.0 | 2584 | 7.4573 |
10.5199 | 324.0 | 2592 | 7.4553 |
10.5411 | 325.0 | 2600 | 7.4570 |
10.5266 | 326.0 | 2608 | 7.4552 |
10.5189 | 327.0 | 2616 | 7.4620 |
10.5166 | 328.0 | 2624 | 7.4585 |
10.454 | 329.0 | 2632 | 7.4612 |
10.5651 | 330.0 | 2640 | 7.4632 |
10.5066 | 331.0 | 2648 | 7.4632 |
10.5266 | 332.0 | 2656 | 7.4614 |
10.5283 | 333.0 | 2664 | 7.4652 |
10.4858 | 334.0 | 2672 | 7.4663 |
10.4521 | 335.0 | 2680 | 7.4663 |
10.4572 | 336.0 | 2688 | 7.4655 |
10.5046 | 337.0 | 2696 | 7.4665 |
10.4653 | 338.0 | 2704 | 7.4674 |
10.4872 | 339.0 | 2712 | 7.4699 |
10.4266 | 340.0 | 2720 | 7.4682 |
10.4524 | 341.0 | 2728 | 7.4707 |
10.4534 | 342.0 | 2736 | 7.4708 |
10.4305 | 343.0 | 2744 | 7.4715 |
10.4627 | 344.0 | 2752 | 7.4713 |
10.4557 | 345.0 | 2760 | 7.4736 |
10.3853 | 346.0 | 2768 | 7.4724 |
10.4317 | 347.0 | 2776 | 7.4720 |
10.3978 | 348.0 | 2784 | 7.4729 |
10.4376 | 349.0 | 2792 | 7.4746 |
10.4341 | 350.0 | 2800 | 7.4734 |
10.4245 | 351.0 | 2808 | 7.4735 |
10.415 | 352.0 | 2816 | 7.4741 |
10.3609 | 353.0 | 2824 | 7.4739 |
10.4545 | 354.0 | 2832 | 7.4749 |
10.4374 | 355.0 | 2840 | 7.4748 |
10.4411 | 356.0 | 2848 | 7.4753 |
10.3917 | 357.0 | 2856 | 7.4759 |
10.4483 | 358.0 | 2864 | 7.4765 |
10.3699 | 359.0 | 2872 | 7.4759 |
10.4319 | 360.0 | 2880 | 7.4760 |
10.3863 | 361.0 | 2888 | 7.4761 |
10.35 | 362.0 | 2896 | 7.4759 |
10.4225 | 363.0 | 2904 | 7.4756 |
10.4223 | 364.0 | 2912 | 7.4758 |
10.4389 | 365.0 | 2920 | 7.4760 |
10.469 | 366.0 | 2928 | 7.4764 |
10.4033 | 367.0 | 2936 | 7.4765 |
10.3862 | 368.0 | 2944 | 7.4765 |
10.4426 | 369.0 | 2952 | 7.4765 |
10.4102 | 370.0 | 2960 | 7.4766 |
10.3714 | 371.0 | 2968 | 7.4765 |
10.4194 | 372.0 | 2976 | 7.4765 |
10.3912 | 373.0 | 2984 | 7.4765 |
10.359 | 374.0 | 2992 | 7.4765 |
10.3752 | 375.0 | 3000 | 7.4765 |
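
In the table above, validation loss reaches its minimum (5.4691 at step 496, epoch 62) and then rises for the rest of training while training loss keeps falling, so the final checkpoint is not the one with the lowest validation loss. The sketch below shows one way to locate that minimum from logged rows. The `rows` list is a small hand-copied excerpt of the table, not the full log, and `best_row` is a hypothetical helper.

```python
# (training_loss, epoch, step, validation_loss), copied from selected table rows.
rows = [
    (51.7784, 1.0, 8, 10.1116),
    (30.7667, 20.0, 160, 6.1593),
    (26.5155, 50.0, 400, 5.5264),
    (25.0215, 62.0, 496, 5.4691),
    (24.6567, 64.0, 512, 5.4700),
    (20.8078, 100.0, 800, 5.7926),
    (13.1916, 200.0, 1600, 6.9641),
    (10.3752, 375.0, 3000, 7.4765),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


best = best_row(rows)
final = rows[-1]

# Optimizer steps per epoch, inferred from the last row (step / epoch).
steps_per_epoch = final[2] / final[1]

# How much worse the final validation loss is than the best one.
gap = final[3] - best[3]

print(f"best: step {best[2]} (epoch {best[1]:.0f}), validation loss {best[3]:.4f}")
print(f"final: step {final[2]}, validation loss {final[3]:.4f}, gap {gap:.4f}")
print(f"steps per epoch: {steps_per_epoch:.0f}")
```

Running the same helper over the full table gives the same best row, since the excerpt includes it. If a lower-validation-loss checkpoint is wanted, the step it reports is the one to look for among the saved checkpoints, assuming checkpoints were kept at that step.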
Framework versions
- Transformers 4.49.0
- Pytorch 2.4.0+cu121
- Datasets 3.4.0
- Tokenizers 0.21.0