---
library_name: transformers
license: other
base_model: internlm/internlm2-math-plus-1_8b
tags:
- generated_from_trainer
model-index:
- name: tfa_output_2025_m05_d13_t13h_51m_30s
  results: []
---

# tfa_output_2025_m05_d13_t13h_51m_30s

This model is a fine-tuned version of [internlm/internlm2-math-plus-1_8b](https://huggingface.co/internlm/internlm2-math-plus-1_8b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2781

## Model description

More information needed

## Intended uses & limitations

More information needed
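Until the intended-use details are documented, the snippet below is a minimal inference sketch using the standard `transformers` API. The repository path is a placeholder (substitute the actual Hub id or local path of this checkpoint), and `trust_remote_code=True` is needed because the InternLM2 base model ships custom modeling code.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: replace with the actual Hub id or local path of this checkpoint.
model_id = "path/to/tfa_output_2025_m05_d13_t13h_51m_30s"

# InternLM2 checkpoints use custom modeling code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype="auto")

prompt = "Solve for x: 2x + 3 = 11."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```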
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-07
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: PAGED_ADAMW with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
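For reference, the hyperparameters above correspond roughly to the `TrainingArguments` sketch below (assuming the standard `transformers` `Trainer` API; the `optim` string is assumed to be the value behind `OptimizerNames.PAGED_ADAMW`, and the model/dataset wiring is omitted because the training data is not documented):

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments configuration matching the values listed above.
# The output_dir mirrors the model name; betas/epsilon are the AdamW defaults.
training_args = TrainingArguments(
    output_dir="tfa_output_2025_m05_d13_t13h_51m_30s",
    learning_rate=1e-7,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,  # effective train batch size: 1 * 8 = 8
    seed=42,
    optim="paged_adamw_32bit",      # assumed mapping for OptimizerNames.PAGED_ADAMW
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
)
```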
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log | 0 | 0 | 1.2781 |
| 2.6109 | 0.0049 | 25 | 1.2778 |
| 2.6952 | 0.0098 | 50 | 1.2779 |
| 2.8036 | 0.0147 | 75 | 1.2780 |
| 2.8808 | 0.0196 | 100 | 1.2780 |
| 3.0941 | 0.0245 | 125 | 1.2781 |
| 2.4797 | 0.0294 | 150 | 1.2779 |
| 2.7034 | 0.0343 | 175 | 1.2778 |
| 3.0358 | 0.0392 | 200 | 1.2781 |
| 2.6713 | 0.0441 | 225 | 1.2779 |
| 2.8436 | 0.0490 | 250 | 1.2779 |
| 2.8396 | 0.0539 | 275 | 1.2779 |
| 2.8333 | 0.0588 | 300 | 1.2779 |
| 2.9845 | 0.0637 | 325 | 1.2782 |
| 2.6105 | 0.0686 | 350 | 1.2780 |
| 2.8874 | 0.0735 | 375 | 1.2781 |
| 2.626 | 0.0784 | 400 | 1.2779 |
| 2.6437 | 0.0833 | 425 | 1.2779 |
| 2.8082 | 0.0882 | 450 | 1.2781 |
| 2.7265 | 0.0931 | 475 | 1.2781 |
| 2.3798 | 0.0981 | 500 | 1.2779 |
| 2.677 | 0.1030 | 525 | 1.2781 |
| 2.6383 | 0.1079 | 550 | 1.2781 |
| 2.7663 | 0.1128 | 575 | 1.2779 |
| 2.6638 | 0.1177 | 600 | 1.2780 |
| 2.7337 | 0.1226 | 625 | 1.2779 |
| 2.801 | 0.1275 | 650 | 1.2781 |
| 2.6107 | 0.1324 | 675 | 1.2781 |
| 2.745 | 0.1373 | 700 | 1.2779 |
| 2.7384 | 0.1422 | 725 | 1.2780 |
| 2.7729 | 0.1471 | 750 | 1.2780 |
| 2.9625 | 0.1520 | 775 | 1.2781 |
| 2.4783 | 0.1569 | 800 | 1.2780 |
| 3.0425 | 0.1618 | 825 | 1.2782 |
| 2.6405 | 0.1667 | 850 | 1.2778 |
| 2.9304 | 0.1716 | 875 | 1.2780 |
| 2.9427 | 0.1765 | 900 | 1.2779 |
| 2.5965 | 0.1814 | 925 | 1.2779 |
| 2.8121 | 0.1863 | 950 | 1.2781 |
| 2.6972 | 0.1912 | 975 | 1.2780 |
| 2.7454 | 0.1961 | 1000 | 1.2780 |
| 2.6769 | 0.2010 | 1025 | 1.2779 |
| 3.0904 | 0.2059 | 1050 | 1.2779 |
| 2.9542 | 0.2108 | 1075 | 1.2779 |
| 2.8049 | 0.2157 | 1100 | 1.2780 |
| 2.668 | 0.2206 | 1125 | 1.2780 |
| 2.8104 | 0.2255 | 1150 | 1.2780 |
| 2.5648 | 0.2304 | 1175 | 1.2780 |
| 2.6159 | 0.2353 | 1200 | 1.2781 |
| 2.8947 | 0.2402 | 1225 | 1.2781 |
| 2.6045 | 0.2451 | 1250 | 1.2780 |
| 2.767 | 0.2500 | 1275 | 1.2780 |
| 2.9516 | 0.2549 | 1300 | 1.2778 |
| 2.7013 | 0.2598 | 1325 | 1.2782 |
| 2.9801 | 0.2647 | 1350 | 1.2779 |
| 2.4982 | 0.2696 | 1375 | 1.2780 |
| 3.023 | 0.2745 | 1400 | 1.2780 |
| 2.7095 | 0.2794 | 1425 | 1.2780 |
| 2.6959 | 0.2843 | 1450 | 1.2779 |
| 2.5794 | 0.2893 | 1475 | 1.2779 |
| 3.0798 | 0.2942 | 1500 | 1.2782 |
| 2.5299 | 0.2991 | 1525 | 1.2781 |
| 2.8247 | 0.3040 | 1550 | 1.2780 |
| 2.6481 | 0.3089 | 1575 | 1.2778 |
| 2.3977 | 0.3138 | 1600 | 1.2780 |
| 2.626 | 0.3187 | 1625 | 1.2778 |
| 2.8101 | 0.3236 | 1650 | 1.2780 |
| 2.7166 | 0.3285 | 1675 | 1.2780 |
| 2.9789 | 0.3334 | 1700 | 1.2779 |
| 2.9734 | 0.3383 | 1725 | 1.2780 |
| 2.6497 | 0.3432 | 1750 | 1.2781 |
| 2.7752 | 0.3481 | 1775 | 1.2780 |
| 3.0049 | 0.3530 | 1800 | 1.2778 |
| 2.7946 | 0.3579 | 1825 | 1.2778 |
| 2.7212 | 0.3628 | 1850 | 1.2780 |
| 2.7503 | 0.3677 | 1875 | 1.2778 |
| 2.6616 | 0.3726 | 1900 | 1.2781 |
| 3.1099 | 0.3775 | 1925 | 1.2781 |
| 2.7114 | 0.3824 | 1950 | 1.2781 |
| 2.6648 | 0.3873 | 1975 | 1.2781 |
| 2.8947 | 0.3922 | 2000 | 1.2780 |
| 2.5636 | 0.3971 | 2025 | 1.2780 |
| 2.618 | 0.4020 | 2050 | 1.2780 |
| 2.6153 | 0.4069 | 2075 | 1.2779 |
| 2.7458 | 0.4118 | 2100 | 1.2780 |
| 2.896 | 0.4167 | 2125 | 1.2779 |
| 2.9055 | 0.4216 | 2150 | 1.2781 |
| 2.8312 | 0.4265 | 2175 | 1.2781 |
| 2.6273 | 0.4314 | 2200 | 1.2781 |
| 2.7673 | 0.4363 | 2225 | 1.2780 |
| 2.887 | 0.4412 | 2250 | 1.2780 |
| 2.7996 | 0.4461 | 2275 | 1.2780 |
| 2.6026 | 0.4510 | 2300 | 1.2780 |
| 2.8637 | 0.4559 | 2325 | 1.2779 |
| 2.6673 | 0.4608 | 2350 | 1.2780 |
| 2.7375 | 0.4657 | 2375 | 1.2780 |
| 2.7014 | 0.4706 | 2400 | 1.2780 |
| 3.0431 | 0.4755 | 2425 | 1.2780 |
| 2.7895 | 0.4805 | 2450 | 1.2780 |
| 2.5445 | 0.4854 | 2475 | 1.2781 |
| 2.8042 | 0.4903 | 2500 | 1.2781 |
| 2.4517 | 0.4952 | 2525 | 1.2779 |
| 3.0145 | 0.5001 | 2550 | 1.2779 |
| 2.8011 | 0.5050 | 2575 | 1.2779 |
| 2.7895 | 0.5099 | 2600 | 1.2779 |
| 2.8871 | 0.5148 | 2625 | 1.2779 |
| 2.7724 | 0.5197 | 2650 | 1.2779 |
| 2.5841 | 0.5246 | 2675 | 1.2780 |
| 2.7891 | 0.5295 | 2700 | 1.2779 |
| 2.9153 | 0.5344 | 2725 | 1.2779 |
| 3.0127 | 0.5393 | 2750 | 1.2780 |
| 2.8079 | 0.5442 | 2775 | 1.2778 |
| 2.8522 | 0.5491 | 2800 | 1.2780 |
| 2.6897 | 0.5540 | 2825 | 1.2779 |
| 2.822 | 0.5589 | 2850 | 1.2777 |
| 2.8534 | 0.5638 | 2875 | 1.2778 |
| 2.5255 | 0.5687 | 2900 | 1.2778 |
| 2.6427 | 0.5736 | 2925 | 1.2780 |
| 3.0485 | 0.5785 | 2950 | 1.2782 |
| 3.0283 | 0.5834 | 2975 | 1.2779 |
| 2.9914 | 0.5883 | 3000 | 1.2780 |
| 2.9151 | 0.5932 | 3025 | 1.2783 |
| 2.397 | 0.5981 | 3050 | 1.2780 |
| 2.8832 | 0.6030 | 3075 | 1.2780 |
| 2.8657 | 0.6079 | 3100 | 1.2780 |
| 2.5352 | 0.6128 | 3125 | 1.2779 |
| 2.8679 | 0.6177 | 3150 | 1.2778 |
| 2.6386 | 0.6226 | 3175 | 1.2780 |
| 2.9986 | 0.6275 | 3200 | 1.2779 |
| 2.842 | 0.6324 | 3225 | 1.2779 |
| 2.6134 | 0.6373 | 3250 | 1.2780 |
| 2.8062 | 0.6422 | 3275 | 1.2781 |
| 2.8878 | 0.6471 | 3300 | 1.2780 |
| 2.6385 | 0.6520 | 3325 | 1.2780 |
| 2.7413 | 0.6569 | 3350 | 1.2780 |
| 2.8832 | 0.6618 | 3375 | 1.2781 |
| 2.782 | 0.6667 | 3400 | 1.2780 |
| 2.7907 | 0.6717 | 3425 | 1.2779 |
| 2.7367 | 0.6766 | 3450 | 1.2781 |
| 2.8375 | 0.6815 | 3475 | 1.2780 |
| 2.8279 | 0.6864 | 3500 | 1.2780 |
| 2.7932 | 0.6913 | 3525 | 1.2781 |
| 2.6823 | 0.6962 | 3550 | 1.2780 |
| 2.7605 | 0.7011 | 3575 | 1.2779 |
| 2.8804 | 0.7060 | 3600 | 1.2780 |
| 2.769 | 0.7109 | 3625 | 1.2781 |
| 2.6696 | 0.7158 | 3650 | 1.2783 |
| 2.7543 | 0.7207 | 3675 | 1.2781 |
| 2.7719 | 0.7256 | 3700 | 1.2780 |
| 2.7031 | 0.7305 | 3725 | 1.2782 |
| 2.9465 | 0.7354 | 3750 | 1.2780 |
| 2.9159 | 0.7403 | 3775 | 1.2780 |
| 2.8126 | 0.7452 | 3800 | 1.2780 |
| 2.8721 | 0.7501 | 3825 | 1.2781 |
| 2.875 | 0.7550 | 3850 | 1.2781 |
| 2.7 | 0.7599 | 3875 | 1.2781 |
| 2.8479 | 0.7648 | 3900 | 1.2782 |
| 2.7953 | 0.7697 | 3925 | 1.2779 |
| 2.7945 | 0.7746 | 3950 | 1.2778 |
| 2.8429 | 0.7795 | 3975 | 1.2777 |
| 2.8151 | 0.7844 | 4000 | 1.2779 |
| 2.9726 | 0.7893 | 4025 | 1.2779 |
| 2.6825 | 0.7942 | 4050 | 1.2781 |
| 2.425 | 0.7991 | 4075 | 1.2781 |
| 2.5809 | 0.8040 | 4100 | 1.2778 |
| 3.0961 | 0.8089 | 4125 | 1.2780 |
| 2.8323 | 0.8138 | 4150 | 1.2780 |
| 2.7579 | 0.8187 | 4175 | 1.2781 |
| 2.8227 | 0.8236 | 4200 | 1.2780 |
| 3.046 | 0.8285 | 4225 | 1.2780 |
| 2.593 | 0.8334 | 4250 | 1.2779 |
| 3.0566 | 0.8383 | 4275 | 1.2779 |
| 2.4772 | 0.8432 | 4300 | 1.2780 |
| 2.9035 | 0.8481 | 4325 | 1.2779 |
| 2.6611 | 0.8530 | 4350 | 1.2780 |
| 2.789 | 0.8579 | 4375 | 1.2780 |
| 2.5477 | 0.8629 | 4400 | 1.2781 |
| 2.65 | 0.8678 | 4425 | 1.2780 |
| 2.7394 | 0.8727 | 4450 | 1.2779 |
| 2.9178 | 0.8776 | 4475 | 1.2781 |
| 2.6875 | 0.8825 | 4500 | 1.2781 |
| 2.6716 | 0.8874 | 4525 | 1.2780 |
| 2.66 | 0.8923 | 4550 | 1.2783 |
| 3.0137 | 0.8972 | 4575 | 1.2780 |
| 2.714 | 0.9021 | 4600 | 1.2779 |
| 2.8224 | 0.9070 | 4625 | 1.2780 |
| 2.8566 | 0.9119 | 4650 | 1.2782 |
| 2.6979 | 0.9168 | 4675 | 1.2781 |
| 3.0773 | 0.9217 | 4700 | 1.2780 |
| 2.7923 | 0.9266 | 4725 | 1.2780 |
| 2.6275 | 0.9315 | 4750 | 1.2781 |
| 2.8812 | 0.9364 | 4775 | 1.2779 |
| 2.8417 | 0.9413 | 4800 | 1.2780 |
| 2.8717 | 0.9462 | 4825 | 1.2780 |
| 2.6871 | 0.9511 | 4850 | 1.2780 |
| 2.8382 | 0.9560 | 4875 | 1.2781 |
| 2.8615 | 0.9609 | 4900 | 1.2779 |
| 2.9204 | 0.9658 | 4925 | 1.2779 |
| 2.7162 | 0.9707 | 4950 | 1.2781 |
| 2.5257 | 0.9756 | 4975 | 1.2780 |
| 2.7771 | 0.9805 | 5000 | 1.2779 |
| 2.8008 | 0.9854 | 5025 | 1.2780 |
| 2.8659 | 0.9903 | 5050 | 1.2778 |
| 2.8939 | 0.9952 | 5075 | 1.2781 |

### Framework versions

- Transformers 4.51.3
- Pytorch 2.1.2+cu121
- Datasets 3.6.0
- Tokenizers 0.21.1