impossible-llms-spanish-natural

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.8184

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
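The hyperparameters above can be restated in plain Python to sanity-check the derived values; note that total_train_batch_size is the per-device batch size multiplied by the number of devices and the gradient accumulation steps. This is a minimal sketch using only the values listed in the card, with no external libraries:

```python
# Plain-Python restatement of the hyperparameters listed above.
hparams = {
    "learning_rate": 1e-4,
    "train_batch_size": 12,          # per device
    "eval_batch_size": 8,            # per device
    "num_devices": 4,
    "gradient_accumulation_steps": 8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "training_steps": 3000,
    "label_smoothing_factor": 0.1,
}

def effective_train_batch(h):
    # total_train_batch_size = per-device batch * devices * accumulation steps
    return h["train_batch_size"] * h["num_devices"] * h["gradient_accumulation_steps"]

def warmup_steps(h):
    # warmup_ratio * training_steps
    return int(h["warmup_ratio"] * h["training_steps"])

print(effective_train_batch(hparams))  # 384, matching the card
print(warmup_steps(hparams))           # 300
```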

Training results

Training Loss Epoch Step Validation Loss
51.7482 1.0 8 10.1009
47.1136 2.0 16 9.3242
45.0853 3.0 24 8.9890
44.3998 4.0 32 8.8363
43.4616 5.0 40 8.6399
42.535 6.0 48 8.4361
41.3844 7.0 56 8.2123
40.0352 8.0 64 7.9688
39.0033 9.0 72 7.7402
37.8514 10.0 80 7.5173
36.5612 11.0 88 7.2928
35.4609 12.0 96 7.0676
34.5723 13.0 104 6.8513
33.2546 14.0 112 6.6514
32.4976 15.0 120 6.4657
31.7763 16.0 128 6.3143
31.025 17.0 136 6.1922
30.5242 18.0 144 6.1013
30.0561 19.0 152 6.0031
29.5901 20.0 160 5.9325
29.4545 21.0 168 5.8617
28.9811 22.0 176 5.8039
28.8831 23.0 184 5.7678
28.4813 24.0 192 5.7255
28.2451 25.0 200 5.6767
28.1487 26.0 208 5.6355
27.8517 27.0 216 5.6016
27.7778 28.0 224 5.5588
27.5518 29.0 232 5.5329
27.4451 30.0 240 5.5022
27.3166 31.0 248 5.4869
27.0987 32.0 256 5.4511
26.7841 33.0 264 5.4280
26.8971 34.0 272 5.4120
26.3692 35.0 280 5.3898
26.4659 36.0 288 5.3737
26.2954 37.0 296 5.3614
26.2919 38.0 304 5.3355
26.1314 39.0 312 5.3200
25.9263 40.0 320 5.3024
25.935 41.0 328 5.2871
25.6915 42.0 336 5.2826
25.4353 43.0 344 5.2606
25.5488 44.0 352 5.2499
25.4338 45.0 360 5.2350
25.2367 46.0 368 5.2183
25.2789 47.0 376 5.2069
24.9316 48.0 384 5.1947
24.6801 49.0 392 5.1896
24.8224 50.0 400 5.1728
24.6572 51.0 408 5.1657
24.3244 52.0 416 5.1515
24.412 53.0 424 5.1430
24.1812 54.0 432 5.1323
23.9904 55.0 440 5.1222
24.0799 56.0 448 5.1179
23.9543 57.0 456 5.1072
23.7952 58.0 464 5.0991
23.5379 59.0 472 5.0910
23.4634 60.0 480 5.0932
23.335 61.0 488 5.0834
23.2467 62.0 496 5.0836
23.1341 63.0 504 5.0820
23.1521 64.0 512 5.0685
22.7104 65.0 520 5.0679
22.6608 66.0 528 5.0670
22.6397 67.0 536 5.0661
22.422 68.0 544 5.0739
22.4531 69.0 552 5.0714
22.2173 70.0 560 5.0637
22.2348 71.0 568 5.0732
22.0378 72.0 576 5.0730
22.0102 73.0 584 5.0752
21.7305 74.0 592 5.0790
21.702 75.0 600 5.0820
21.7926 76.0 608 5.0809
21.554 77.0 616 5.0860
21.5144 78.0 624 5.0952
21.1927 79.0 632 5.1032
21.0842 80.0 640 5.1086
21.088 81.0 648 5.1133
20.8599 82.0 656 5.1215
20.7647 83.0 664 5.1244
20.7258 84.0 672 5.1338
20.6174 85.0 680 5.1403
20.4121 86.0 688 5.1380
20.3097 87.0 696 5.1572
20.2016 88.0 704 5.1605
20.0856 89.0 712 5.1729
20.0558 90.0 720 5.1826
19.8661 91.0 728 5.2008
19.6604 92.0 736 5.1983
19.7707 93.0 744 5.2132
19.5645 94.0 752 5.2248
19.563 95.0 760 5.2305
19.356 96.0 768 5.2506
19.3115 97.0 776 5.2607
19.317 98.0 784 5.2729
18.9184 99.0 792 5.2840
18.9451 100.0 800 5.2899
18.9796 101.0 808 5.2987
18.8531 102.0 816 5.3146
18.5326 103.0 824 5.3257
18.6404 104.0 832 5.3395
18.5631 105.0 840 5.3454
18.375 106.0 848 5.3609
18.3254 107.0 856 5.3662
18.2828 108.0 864 5.3731
17.9908 109.0 872 5.3853
17.8628 110.0 880 5.4045
17.8125 111.0 888 5.4169
17.7775 112.0 896 5.4288
17.6947 113.0 904 5.4396
17.5524 114.0 912 5.4574
17.6086 115.0 920 5.4649
17.5326 116.0 928 5.4638
17.3661 117.0 936 5.4946
17.2628 118.0 944 5.5055
17.1431 119.0 952 5.5055
17.1413 120.0 960 5.5349
17.0927 121.0 968 5.5345
16.8162 122.0 976 5.5575
16.6914 123.0 984 5.5597
16.6333 124.0 992 5.5778
16.6871 125.0 1000 5.5835
16.5154 126.0 1008 5.5979
16.5725 127.0 1016 5.6137
16.3883 128.0 1024 5.6261
16.3495 129.0 1032 5.6381
16.1944 130.0 1040 5.6494
16.1893 131.0 1048 5.6676
16.1091 132.0 1056 5.6876
15.9974 133.0 1064 5.6895
15.9566 134.0 1072 5.6994
15.7183 135.0 1080 5.7051
15.7345 136.0 1088 5.7222
15.7121 137.0 1096 5.7261
15.6825 138.0 1104 5.7487
15.5537 139.0 1112 5.7612
15.4367 140.0 1120 5.7742
15.467 141.0 1128 5.7805
15.3692 142.0 1136 5.7981
15.3071 143.0 1144 5.8083
15.2233 144.0 1152 5.8245
15.0736 145.0 1160 5.8318
15.0534 146.0 1168 5.8394
14.9743 147.0 1176 5.8561
14.8758 148.0 1184 5.8611
14.8195 149.0 1192 5.8811
14.7533 150.0 1200 5.8840
14.6149 151.0 1208 5.8999
14.6015 152.0 1216 5.9174
14.5694 153.0 1224 5.9121
14.4739 154.0 1232 5.9283
14.5006 155.0 1240 5.9400
14.3506 156.0 1248 5.9536
14.3135 157.0 1256 5.9643
14.2873 158.0 1264 5.9644
14.1698 159.0 1272 5.9845
14.159 160.0 1280 5.9881
14.0499 161.0 1288 6.0032
13.9969 162.0 1296 6.0161
13.9918 163.0 1304 6.0177
13.989 164.0 1312 6.0409
13.8361 165.0 1320 6.0419
13.8109 166.0 1328 6.0592
13.7056 167.0 1336 6.0604
13.6535 168.0 1344 6.0735
13.6727 169.0 1352 6.0859
13.4892 170.0 1360 6.0983
13.5187 171.0 1368 6.1006
13.416 172.0 1376 6.1236
13.4031 173.0 1384 6.1257
13.3737 174.0 1392 6.1196
13.3202 175.0 1400 6.1495
13.2018 176.0 1408 6.1524
13.1343 177.0 1416 6.1722
13.1208 178.0 1424 6.1700
13.068 179.0 1432 6.1792
12.9428 180.0 1440 6.1842
12.9316 181.0 1448 6.1961
12.966 182.0 1456 6.2035
12.9418 183.0 1464 6.2125
12.7829 184.0 1472 6.2231
12.796 185.0 1480 6.2370
12.75 186.0 1488 6.2376
12.6302 187.0 1496 6.2485
12.6604 188.0 1504 6.2562
12.593 189.0 1512 6.2658
12.6442 190.0 1520 6.2673
12.4919 191.0 1528 6.2805
12.4591 192.0 1536 6.2892
12.4038 193.0 1544 6.3006
12.3682 194.0 1552 6.3081
12.4278 195.0 1560 6.3119
12.3047 196.0 1568 6.3168
12.2657 197.0 1576 6.3308
12.1921 198.0 1584 6.3365
12.1392 199.0 1592 6.3429
12.1538 200.0 1600 6.3506
12.0806 201.0 1608 6.3562
12.0888 202.0 1616 6.3606
12.0709 203.0 1624 6.3710
11.8803 204.0 1632 6.3793
11.9575 205.0 1640 6.3862
11.9468 206.0 1648 6.3876
11.9192 207.0 1656 6.3980
11.8949 208.0 1664 6.4026
11.8611 209.0 1672 6.4132
11.8237 210.0 1680 6.4116
11.7115 211.0 1688 6.4298
11.7199 212.0 1696 6.4204
11.6559 213.0 1704 6.4333
11.6339 214.0 1712 6.4446
11.6064 215.0 1720 6.4457
11.5941 216.0 1728 6.4505
11.5943 217.0 1736 6.4660
11.5027 218.0 1744 6.4639
11.5226 219.0 1752 6.4743
11.4428 220.0 1760 6.4731
11.3947 221.0 1768 6.4823
11.4507 222.0 1776 6.4905
11.3622 223.0 1784 6.4986
11.3231 224.0 1792 6.5073
11.2868 225.0 1800 6.5093
11.2595 226.0 1808 6.5114
11.2187 227.0 1816 6.5212
11.2437 228.0 1824 6.5231
11.2354 229.0 1832 6.5265
11.2185 230.0 1840 6.5331
11.1903 231.0 1848 6.5374
11.1166 232.0 1856 6.5424
11.1199 233.0 1864 6.5454
11.092 234.0 1872 6.5530
11.1067 235.0 1880 6.5634
11.0527 236.0 1888 6.5692
11.0075 237.0 1896 6.5730
10.9491 238.0 1904 6.5732
10.9812 239.0 1912 6.5781
10.8566 240.0 1920 6.5769
10.9052 241.0 1928 6.5863
10.9273 242.0 1936 6.5914
10.8682 243.0 1944 6.5963
10.8243 244.0 1952 6.5998
10.8447 245.0 1960 6.6061
10.8102 246.0 1968 6.6071
10.7664 247.0 1976 6.6137
10.8312 248.0 1984 6.6259
10.6884 249.0 1992 6.6233
10.7115 250.0 2000 6.6312
10.7217 251.0 2008 6.6256
10.6847 252.0 2016 6.6327
10.6836 253.0 2024 6.6431
10.637 254.0 2032 6.6433
10.5986 255.0 2040 6.6483
10.5909 256.0 2048 6.6469
10.5634 257.0 2056 6.6528
10.4946 258.0 2064 6.6533
10.5208 259.0 2072 6.6567
10.4965 260.0 2080 6.6610
10.4934 261.0 2088 6.6725
10.4299 262.0 2096 6.6749
10.4986 263.0 2104 6.6818
10.4502 264.0 2112 6.6766
10.4203 265.0 2120 6.6822
10.4052 266.0 2128 6.6795
10.4339 267.0 2136 6.6875
10.3813 268.0 2144 6.6916
10.3406 269.0 2152 6.6874
10.3575 270.0 2160 6.6944
10.3259 271.0 2168 6.7024
10.3898 272.0 2176 6.7016
10.2788 273.0 2184 6.7059
10.2602 274.0 2192 6.7072
10.2348 275.0 2200 6.7081
10.2383 276.0 2208 6.7134
10.316 277.0 2216 6.7189
10.2411 278.0 2224 6.7181
10.2012 279.0 2232 6.7194
10.1992 280.0 2240 6.7229
10.1965 281.0 2248 6.7239
10.2203 282.0 2256 6.7193
10.1755 283.0 2264 6.7321
10.1564 284.0 2272 6.7296
10.1029 285.0 2280 6.7375
10.1323 286.0 2288 6.7398
10.0811 287.0 2296 6.7425
10.107 288.0 2304 6.7461
10.1342 289.0 2312 6.7479
10.1482 290.0 2320 6.7445
10.0679 291.0 2328 6.7457
10.0919 292.0 2336 6.7526
10.0352 293.0 2344 6.7521
10.0986 294.0 2352 6.7522
10.0199 295.0 2360 6.7572
9.9978 296.0 2368 6.7582
10.0085 297.0 2376 6.7607
9.9583 298.0 2384 6.7653
10.058 299.0 2392 6.7660
10.006 300.0 2400 6.7694
9.9744 301.0 2408 6.7687
9.9453 302.0 2416 6.7729
9.9084 303.0 2424 6.7744
9.9569 304.0 2432 6.7738
9.9564 305.0 2440 6.7771
9.9217 306.0 2448 6.7791
9.9595 307.0 2456 6.7833
9.9157 308.0 2464 6.7840
9.9505 309.0 2472 6.7788
9.8902 310.0 2480 6.7804
9.913 311.0 2488 6.7810
9.8953 312.0 2496 6.7854
9.8388 313.0 2504 6.7866
9.8951 314.0 2512 6.7877
9.8956 315.0 2520 6.7907
9.8355 316.0 2528 6.7902
9.9097 317.0 2536 6.7918
9.8304 318.0 2544 6.7911
9.8385 319.0 2552 6.7912
9.8874 320.0 2560 6.7946
9.8086 321.0 2568 6.7969
9.8169 322.0 2576 6.7963
9.7993 323.0 2584 6.7970
9.8324 324.0 2592 6.7978
9.8391 325.0 2600 6.7984
9.8765 326.0 2608 6.8019
9.7892 327.0 2616 6.8017
9.7902 328.0 2624 6.8017
9.8015 329.0 2632 6.8038
9.8687 330.0 2640 6.8047
9.7813 331.0 2648 6.8049
9.7734 332.0 2656 6.8059
9.8321 333.0 2664 6.8087
9.7921 334.0 2672 6.8071
9.7955 335.0 2680 6.8085
9.7474 336.0 2688 6.8057
9.7734 337.0 2696 6.8117
9.7687 338.0 2704 6.8099
9.7499 339.0 2712 6.8132
9.7645 340.0 2720 6.8108
9.7615 341.0 2728 6.8119
9.7293 342.0 2736 6.8126
9.7555 343.0 2744 6.8127
9.7456 344.0 2752 6.8131
9.7821 345.0 2760 6.8145
9.7462 346.0 2768 6.8143
9.8005 347.0 2776 6.8151
9.7558 348.0 2784 6.8156
9.7434 349.0 2792 6.8151
9.7548 350.0 2800 6.8159
9.718 351.0 2808 6.8175
9.7192 352.0 2816 6.8159
9.7142 353.0 2824 6.8166
9.7395 354.0 2832 6.8176
9.7601 355.0 2840 6.8173
9.7876 356.0 2848 6.8175
9.7267 357.0 2856 6.8177
9.8034 358.0 2864 6.8171
9.7382 359.0 2872 6.8170
9.7534 360.0 2880 6.8178
9.7423 361.0 2888 6.8182
9.7001 362.0 2896 6.8180
9.704 363.0 2904 6.8181
9.7615 364.0 2912 6.8186
9.7132 365.0 2920 6.8186
9.6365 366.0 2928 6.8184
9.7542 367.0 2936 6.8183
9.7524 368.0 2944 6.8183
9.7454 369.0 2952 6.8183
9.7615 370.0 2960 6.8182
9.7779 371.0 2968 6.8183
9.7548 372.0 2976 6.8183
9.7373 373.0 2984 6.8183
9.7027 374.0 2992 6.8184
9.7623 375.0 3000 6.8184
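The table above shows validation loss reaching its minimum (5.0637) around epoch 70 and then climbing steadily to the final 6.8184, a typical overfitting pattern. A small helper for locating the best checkpoint from rows in this log format; the sample string below contains a few rows copied verbatim from the table (columns: training loss, epoch, step, validation loss):

```python
# Find the epoch with the lowest validation loss in a whitespace-separated log.
log = """\
33.2546 14.0 112 6.6514
22.2173 70.0 560 5.0637
21.7305 74.0 592 5.0790
9.7623 375.0 3000 6.8184
"""

def best_epoch(text):
    rows = [line.split() for line in text.strip().splitlines()]
    # each row: train_loss, epoch, step, val_loss
    pairs = ((float(r[1]), float(r[3])) for r in rows)
    return min(pairs, key=lambda t: t[1])  # minimize validation loss

print(best_epoch(log))  # (70.0, 5.0637)
```

Run over the full table, this would suggest checkpointing around epoch 70 rather than using the final model.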

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0

Model details

  • Model size: 126M params
  • Tensor type: F32
  • Format: Safetensors