
flan-t5laa-large

This model is a fine-tuned version of hrezaei/flan-t5laa-large on the HuggingFaceFW/fineweb sample-350BT dataset. It achieves the following results on the evaluation set:

  • Perplexity: 1.1522
  • Loss: 0.1417
  • Accuracy: 0.0025
  • Lookahead Perplexity: 524.0285
  • Lookahead Loss: 6.2615
  • Base Perplexity: 1.1386
  • Base Loss: 0.1298
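
The reported perplexities appear to be the exponential of the corresponding losses, which is the usual convention when cross-entropy is measured in nats. The card does not state this explicitly, so the snippet below is only a quick consistency check under that assumption, not a description of the evaluation code. The names `reported` and `perplexity_from_loss` are illustrative.

```python
import math

# (loss, perplexity) pairs copied from the evaluation results above.
reported = {
    "overall": (0.1417, 1.1522),
    "lookahead": (6.2615, 524.0285),
    "base": (0.1298, 1.1386),
}

def perplexity_from_loss(loss: float) -> float:
    """Assumed relationship: perplexity = exp(mean cross-entropy in nats)."""
    return math.exp(loss)

for name, (loss, ppl) in reported.items():
    estimate = perplexity_from_loss(loss)
    # The losses are rounded to four decimals, so a small relative gap is expected.
    rel_gap = abs(estimate - ppl) / ppl
    print(f"{name}: exp({loss}) = {estimate:.4f}, reported {ppl}, relative gap {rel_gap:.1e}")
```

Because the exponential amplifies small differences, rounding the lookahead loss to four decimals already shifts its perplexity in the second decimal place.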

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • training_steps: 524288
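
As a rough sketch of how these settings fit together: the total batch sizes follow from the per-device size times the number of devices, and a linear schedule decays the learning rate toward zero over `training_steps`. The card lists neither warmup steps nor gradient accumulation, so the code assumes zero warmup and an accumulation factor of 1. The function and constant names are illustrative and are not taken from the training script.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16      # per device
NUM_DEVICES = 2
TRAINING_STEPS = 524288

# Assumptions (not stated in the card): no gradient accumulation, no warmup.
GRAD_ACCUM_STEPS = 1
WARMUP_STEPS = 0

def total_batch_size(per_device: int, devices: int, accum: int = 1) -> int:
    """Sequences consumed per optimizer step across all devices."""
    return per_device * devices * accum

def linear_lr(step: int) -> float:
    """Learning rate at `step` for a linear warmup/decay schedule."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)

batch = total_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES, GRAD_ACCUM_STEPS)
print("total_train_batch_size:", batch)
print("sequences seen over the run:", batch * TRAINING_STEPS)
for step in (0, TRAINING_STEPS // 2, TRAINING_STEPS):
    print(f"lr at step {step}: {linear_lr(step):.2e}")
```

Under these assumptions the computed total matches the listed `total_train_batch_size` of 32, and the learning rate is half its initial value at the midpoint of training.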

Training results

| Training Loss | Epoch | Step | Accuracy | Base Loss | Base Perplexity | Lookahead Loss | Lookahead Perplexity | Validation Loss | Perplexity |
|---|---|---|---|---|---|---|---|---|---|
| 0.3249 | 0.0095 | 5000 | 0.0025 | 0.1298 | 1.1386 | 9.1628 | 9536.0974 | 0.1473 | 1.1587 |
| 0.3149 | 0.0191 | 10000 | 0.0025 | 0.1298 | 1.1386 | 8.2987 | 4018.5587 | 0.1457 | 1.1568 |
| 0.3455 | 0.0286 | 15000 | 0.0025 | 0.1298 | 1.1386 | 7.8543 | 2576.7294 | 0.1448 | 1.1558 |
| 0.3164 | 0.0381 | 20000 | 0.0025 | 0.1298 | 1.1386 | 7.6043 | 2006.8105 | 0.1443 | 1.1552 |
| 0.3412 | 0.0477 | 25000 | 0.0025 | 0.1298 | 1.1386 | 7.4405 | 1703.6775 | 0.1440 | 1.1549 |
| 0.3334 | 0.0572 | 30000 | 0.0025 | 0.1298 | 1.1386 | 7.3224 | 1513.8265 | 0.1438 | 1.1546 |
| 0.3182 | 0.0668 | 35000 | 0.0025 | 0.1298 | 1.1386 | 7.2325 | 1383.6223 | 0.1436 | 1.1544 |
| 0.3193 | 0.0763 | 40000 | 0.0025 | 0.1298 | 1.1386 | 7.1599 | 1286.7995 | 0.1434 | 1.1542 |
| 0.3112 | 0.0858 | 45000 | 0.0025 | 0.1298 | 1.1386 | 7.1003 | 1212.3711 | 0.1433 | 1.1541 |
| 0.3084 | 0.0954 | 50000 | 0.0025 | 0.1298 | 1.1386 | 7.0527 | 1155.9677 | 0.1432 | 1.1540 |
| 0.3281 | 0.1049 | 55000 | 0.0025 | 0.1298 | 1.1386 | 7.0097 | 1107.3615 | 0.1431 | 1.1539 |
| 0.3096 | 0.1144 | 60000 | 0.0025 | 0.1298 | 1.1386 | 6.9716 | 1065.9302 | 0.1431 | 1.1538 |
| 0.3168 | 1.0048 | 65000 | 0.0025 | 0.1298 | 1.1386 | 6.9373 | 1029.9387 | 0.1430 | 1.1537 |
| 0.3158 | 1.0143 | 70000 | 0.0025 | 0.1298 | 1.1386 | 6.9058 | 998.0076 | 0.1429 | 1.1536 |
| 0.3149 | 1.0238 | 75000 | 0.0025 | 0.1298 | 1.1386 | 6.8755 | 968.2614 | 0.1429 | 1.1536 |
| 0.3185 | 1.0334 | 80000 | 0.0025 | 0.1298 | 1.1386 | 6.8495 | 943.4458 | 0.1428 | 1.1535 |
| 0.3247 | 1.0429 | 85000 | 0.0025 | 0.1298 | 1.1386 | 6.8251 | 920.7070 | 0.1428 | 1.1535 |
| 0.3166 | 1.0525 | 90000 | 0.0025 | 0.1298 | 1.1386 | 6.8033 | 900.7799 | 0.1427 | 1.1534 |
| 0.3171 | 1.0620 | 95000 | 0.0025 | 0.1298 | 1.1386 | 6.7802 | 880.2640 | 0.1427 | 1.1534 |
| 0.3247 | 1.0715 | 100000 | 0.0025 | 0.1298 | 1.1386 | 6.7589 | 861.6873 | 0.1426 | 1.1533 |
| 0.3199 | 1.0095 | 105000 | 0.0025 | 0.1298 | 1.1386 | 6.7393 | 844.9302 | 0.1426 | 1.1533 |
| 0.3116 | 1.0191 | 110000 | 0.0025 | 0.1298 | 1.1386 | 6.7202 | 828.9964 | 0.1426 | 1.1532 |
| 0.3431 | 1.0286 | 115000 | 0.0025 | 0.1298 | 1.1386 | 6.7028 | 814.7008 | 0.1425 | 1.1532 |
| 0.3145 | 1.0381 | 120000 | 0.0025 | 0.1298 | 1.1386 | 6.6864 | 801.3963 | 0.1425 | 1.1531 |
| 0.3396 | 1.0477 | 125000 | 0.0025 | 0.1298 | 1.1386 | 6.6714 | 789.4998 | 0.1425 | 1.1531 |
| 0.332 | 1.0572 | 130000 | 0.0025 | 0.1298 | 1.1386 | 6.6562 | 777.5850 | 0.1424 | 1.1531 |
| 0.3169 | 1.0668 | 135000 | 0.0025 | 0.1298 | 1.1386 | 6.6411 | 765.9321 | 0.1424 | 1.1530 |
| 0.3183 | 1.0763 | 140000 | 0.0025 | 0.1298 | 1.1386 | 6.6259 | 754.4168 | 0.1424 | 1.1530 |
| 0.3102 | 1.0858 | 145000 | 0.0025 | 0.1298 | 1.1386 | 6.6117 | 743.7487 | 0.1423 | 1.1530 |
| 0.3075 | 1.0954 | 150000 | 0.0025 | 0.1298 | 1.1386 | 6.6002 | 735.2481 | 0.1423 | 1.1530 |
| 0.3272 | 1.1049 | 155000 | 0.0025 | 0.1298 | 1.1386 | 6.5881 | 726.3988 | 0.1423 | 1.1529 |
| 0.3088 | 1.1144 | 160000 | 0.0025 | 0.1298 | 1.1386 | 6.5765 | 717.9999 | 0.1423 | 1.1529 |
| 0.316 | 2.0048 | 165000 | 0.0025 | 0.1298 | 1.1386 | 6.5648 | 709.6853 | 0.1423 | 1.1529 |
| 0.315 | 2.0143 | 170000 | 0.0025 | 0.1298 | 1.1386 | 6.5536 | 701.7924 | 0.1422 | 1.1528 |
| 0.3142 | 2.0238 | 175000 | 0.0025 | 0.1298 | 1.1386 | 6.5417 | 693.4763 | 0.1422 | 1.1528 |
| 0.3178 | 2.0334 | 180000 | 0.0025 | 0.1298 | 1.1386 | 6.5319 | 686.6713 | 0.1422 | 1.1528 |
| 0.324 | 2.0429 | 185000 | 0.0025 | 0.1298 | 1.1386 | 6.5221 | 680.0125 | 0.1422 | 1.1528 |
| 0.316 | 2.0525 | 190000 | 0.0025 | 0.1298 | 1.1386 | 6.5135 | 674.1869 | 0.1422 | 1.1528 |
| 0.3165 | 2.0620 | 195000 | 0.0025 | 0.1298 | 1.1386 | 6.5032 | 667.2772 | 0.1421 | 1.1527 |
| 0.3241 | 2.0715 | 200000 | 0.0025 | 0.1298 | 1.1386 | 6.4936 | 660.9243 | 0.1421 | 1.1527 |
| 0.3193 | 1.0095 | 205000 | 0.0025 | 0.1298 | 1.1386 | 6.4847 | 655.0490 | 0.1421 | 1.1527 |
| 0.3111 | 1.0191 | 210000 | 0.0025 | 0.1298 | 1.1386 | 6.4758 | 649.2363 | 0.1421 | 1.1527 |
| 0.3426 | 1.0286 | 215000 | 0.0025 | 0.1298 | 1.1386 | 6.4675 | 643.8971 | 0.1421 | 1.1527 |
| 0.314 | 1.0381 | 220000 | 0.0025 | 0.1298 | 1.1386 | 6.4598 | 638.9146 | 0.1420 | 1.1526 |
| 0.3391 | 1.0477 | 225000 | 0.0025 | 0.1298 | 1.1386 | 6.4527 | 634.4029 | 0.1420 | 1.1526 |
| 0.3315 | 1.0572 | 230000 | 0.0025 | 0.1298 | 1.1386 | 6.4453 | 629.7214 | 0.1420 | 1.1526 |
| 0.3165 | 1.0668 | 235000 | 0.0025 | 0.1298 | 1.1386 | 6.4377 | 624.9499 | 0.1420 | 1.1526 |
| 0.3179 | 1.0763 | 240000 | 0.0025 | 0.1298 | 1.1386 | 6.4298 | 620.0502 | 0.1420 | 1.1526 |
| 0.3098 | 1.0858 | 245000 | 0.0025 | 0.1298 | 1.1386 | 6.4223 | 615.4145 | 0.1420 | 1.1526 |
| 0.3071 | 1.0954 | 250000 | 0.0025 | 0.1298 | 1.1386 | 6.4166 | 611.9283 | 0.1420 | 1.1525 |
| 0.3269 | 1.1049 | 255000 | 0.0025 | 0.1298 | 1.1386 | 6.4104 | 608.1609 | 0.1420 | 1.1525 |
| 0.3085 | 1.1144 | 260000 | 0.0025 | 0.1298 | 1.1386 | 6.4045 | 604.5368 | 0.1419 | 1.1525 |
| 0.3156 | 2.0048 | 265000 | 0.0025 | 0.1298 | 1.1386 | 6.3983 | 600.8226 | 0.1419 | 1.1525 |
| 0.3147 | 2.0143 | 270000 | 0.0025 | 0.1298 | 1.1386 | 6.3924 | 597.3057 | 0.1419 | 1.1525 |
| 0.3139 | 2.0238 | 275000 | 0.0025 | 0.1298 | 1.1386 | 6.3859 | 593.4073 | 0.1419 | 1.1525 |
| 0.3175 | 2.0334 | 280000 | 0.0025 | 0.1298 | 1.1386 | 6.3807 | 590.3181 | 0.1419 | 1.1525 |
| 0.3237 | 2.0429 | 285000 | 0.0025 | 0.1298 | 1.1386 | 6.3755 | 587.2825 | 0.1419 | 1.1524 |
| 0.3157 | 2.0525 | 290000 | 0.0025 | 0.1298 | 1.1386 | 6.3711 | 584.7022 | 0.1419 | 1.1524 |
| 0.3162 | 2.0620 | 295000 | 0.0025 | 0.1298 | 1.1386 | 6.3654 | 581.4033 | 0.1419 | 1.1524 |
| 0.3238 | 2.0715 | 300000 | 0.0025 | 0.1298 | 1.1386 | 6.3603 | 578.4126 | 0.1419 | 1.1524 |
| 0.319 | 1.0095 | 305000 | 0.0025 | 0.1298 | 1.1386 | 6.3554 | 575.6083 | 0.1418 | 1.1524 |
| 0.3108 | 1.0191 | 310000 | 0.0025 | 0.1298 | 1.1386 | 6.3506 | 572.8332 | 0.1418 | 1.1524 |
| 0.3424 | 1.0286 | 315000 | 0.0025 | 0.1298 | 1.1386 | 6.3461 | 570.2490 | 0.1418 | 1.1524 |
| 0.3137 | 1.0381 | 320000 | 0.0025 | 0.1298 | 1.1386 | 6.3419 | 567.8765 | 0.1418 | 1.1524 |
| 0.3389 | 1.0477 | 325000 | 0.0025 | 0.1298 | 1.1386 | 6.3381 | 565.7363 | 0.1418 | 1.1524 |
| 0.3313 | 1.0572 | 330000 | 0.0025 | 0.1298 | 1.1386 | 6.3342 | 563.5154 | 0.1418 | 1.1524 |
| 0.3163 | 1.0668 | 335000 | 0.0025 | 0.1298 | 1.1386 | 6.3301 | 561.2170 | 0.1418 | 1.1523 |
| 0.3177 | 1.0763 | 340000 | 0.0025 | 0.1298 | 1.1386 | 6.3258 | 558.8315 | 0.1418 | 1.1523 |
| 0.3096 | 1.0858 | 345000 | 0.0025 | 0.1298 | 1.1386 | 6.3217 | 556.5380 | 0.1418 | 1.1523 |
| 0.3069 | 1.0954 | 350000 | 0.0025 | 0.1298 | 1.1386 | 6.3188 | 554.8901 | 0.1418 | 1.1523 |
| 0.3267 | 1.1049 | 355000 | 0.0025 | 0.1298 | 1.1386 | 6.3155 | 553.0973 | 0.1418 | 1.1523 |
| 0.3083 | 1.1144 | 360000 | 0.0025 | 0.1298 | 1.1386 | 6.3124 | 551.3773 | 0.1418 | 1.1523 |
| 0.3154 | 2.0048 | 365000 | 0.0025 | 0.1298 | 1.1386 | 6.3092 | 549.6034 | 0.1418 | 1.1523 |
| 0.3145 | 2.0143 | 370000 | 0.0025 | 0.1298 | 1.1386 | 6.3062 | 547.9500 | 0.1417 | 1.1523 |
| 0.3137 | 2.0238 | 375000 | 0.0025 | 0.1298 | 1.1386 | 6.3028 | 546.0782 | 0.1417 | 1.1523 |
| 0.3173 | 2.0334 | 380000 | 0.0025 | 0.1298 | 1.1386 | 6.3001 | 544.6261 | 0.1417 | 1.1523 |
| 0.3235 | 2.0429 | 385000 | 0.0025 | 0.1298 | 1.1386 | 6.2975 | 543.2184 | 0.1417 | 1.1523 |
| 0.3155 | 2.0525 | 390000 | 0.0025 | 0.1298 | 1.1386 | 6.2954 | 542.0614 | 0.1417 | 1.1523 |
| 0.3161 | 2.0620 | 395000 | 0.0025 | 0.1298 | 1.1386 | 6.2926 | 540.5324 | 0.1417 | 1.1523 |
| 0.3237 | 2.0715 | 400000 | 0.0025 | 0.1298 | 1.1386 | 6.2901 | 539.1868 | 0.1417 | 1.1523 |
| 0.3162 | 1.0095 | 405000 | 0.0025 | 0.1298 | 1.1386 | 6.2878 | 537.9444 | 0.1417 | 1.1522 |
| 0.31 | 1.0191 | 410000 | 0.0025 | 0.1298 | 1.1386 | 6.2854 | 536.6766 | 0.1417 | 1.1522 |
| 0.3412 | 1.0286 | 415000 | 0.0025 | 0.1298 | 1.1386 | 6.2832 | 535.5226 | 0.1417 | 1.1522 |
| 0.3148 | 1.0381 | 420000 | 0.0025 | 0.1298 | 1.1386 | 6.2813 | 534.5032 | 0.1417 | 1.1522 |
| 0.3387 | 1.0477 | 425000 | 0.0025 | 0.1298 | 1.1386 | 6.2797 | 533.6123 | 0.1417 | 1.1522 |
| 0.3316 | 1.0572 | 430000 | 0.0025 | 0.1298 | 1.1386 | 6.2780 | 532.7114 | 0.1417 | 1.1522 |
| 0.3168 | 1.0668 | 435000 | 0.0025 | 0.1298 | 1.1386 | 6.2761 | 531.7269 | 0.1417 | 1.1522 |
| 0.3173 | 1.0763 | 440000 | 0.0025 | 0.1298 | 1.1386 | 6.2744 | 530.8253 | 0.1417 | 1.1522 |
| 0.3098 | 1.0858 | 445000 | 0.0025 | 0.1298 | 1.1386 | 6.2727 | 529.8949 | 0.1417 | 1.1522 |
| 0.3078 | 1.0954 | 450000 | 0.0025 | 0.1298 | 1.1386 | 6.2715 | 529.2726 | 0.1417 | 1.1522 |
| 0.3275 | 1.1049 | 455000 | 0.0025 | 0.1298 | 1.1386 | 6.2703 | 528.6246 | 0.1417 | 1.1522 |
| 0.308 | 1.1144 | 460000 | 0.0025 | 0.1298 | 1.1386 | 6.2692 | 528.0456 | 0.1417 | 1.1522 |
| 0.3159 | 2.0048 | 465000 | 0.0025 | 0.1298 | 1.1386 | 6.2680 | 527.4365 | 0.1417 | 1.1522 |
| 0.3115 | 2.0143 | 470000 | 0.0025 | 0.1298 | 1.1386 | 6.2670 | 526.8853 | 0.1417 | 1.1522 |
| 0.3137 | 2.0238 | 475000 | 0.0025 | 0.1298 | 1.1386 | 6.2659 | 526.2984 | 0.1417 | 1.1522 |
| 0.3146 | 2.0334 | 480000 | 0.0025 | 0.1298 | 1.1386 | 6.2651 | 525.8735 | 0.1417 | 1.1522 |
| 0.323 | 2.0429 | 485000 | 0.0025 | 0.1298 | 1.1386 | 6.2643 | 525.4838 | 0.1417 | 1.1522 |
| 0.3177 | 2.0525 | 490000 | 0.0025 | 0.1298 | 1.1386 | 6.2638 | 525.1886 | 0.1417 | 1.1522 |
| 0.3147 | 2.0620 | 495000 | 0.0025 | 0.1298 | 1.1386 | 6.2631 | 524.8575 | 0.1417 | 1.1522 |
| 0.3257 | 2.0715 | 500000 | 0.0025 | 0.1298 | 1.1386 | 6.2626 | 524.6004 | 0.1417 | 1.1522 |
| 0.3387 | 2.0811 | 505000 | 0.0025 | 0.1298 | 1.1386 | 6.2622 | 524.3688 | 0.1417 | 1.1522 |
| 0.3118 | 2.0906 | 510000 | 0.0025 | 0.1298 | 1.1386 | 6.2619 | 524.1995 | 0.1417 | 1.1522 |
| 0.3147 | 2.1001 | 515000 | 0.0025 | 0.1298 | 1.1386 | 6.2617 | 524.1102 | 0.1417 | 1.1522 |
| 0.3287 | 2.1097 | 520000 | 0.0025 | 0.1298 | 1.1386 | 6.2616 | 524.0447 | 0.1417 | 1.1522 |

Framework versions

  • Transformers 4.57.0.dev0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size: 0.8B params (Safetensors, tensor type F32)