Configurations choice
A collection of 52 items. The choice of configuration is based on the results of the different fine-tuning runs. All provide more or less the same results, but configurations 1 and 2 are way faster! (lr)
This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B-Instruct on the GaetanMichelet/chat-60_ft_task-1 dataset. It achieves the validation losses reported in the training results table below on the evaluation set.
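The card does not include a usage snippet, so here is a minimal inference sketch. It assumes the fine-tuned weights are published as a full model; the repo id `GaetanMichelet/chat-60_ft_task-1_model` is a hypothetical placeholder for a model from this collection. If the fine-tune was released as a LoRA/PEFT adapter instead, load the base model first and attach the adapter with `peft.PeftModel.from_pretrained`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- substitute the actual model from this collection.
finetuned_id = "GaetanMichelet/chat-60_ft_task-1_model"

tokenizer = AutoTokenizer.from_pretrained(finetuned_id)
model = AutoModelForCausalLM.from_pretrained(
    finetuned_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Llama 3.1 Instruct fine-tunes expect the chat template, not raw text.
messages = [{"role": "user", "content": "Summarize what fine-tuning does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```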
Model description
More information needed

Intended uses & limitations
More information needed

Training and evaluation data
More information needed
Training hyperparameters
More information needed

Training results
| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 2.1668        | 0.6957  | 2    | 2.0787          |
| 2.1494        | 1.7391  | 5    | 2.0737          |
| 2.127         | 2.7826  | 8    | 2.0633          |
| 2.023         | 3.8261  | 11   | 2.0470          |
| 2.0881        | 4.8696  | 14   | 2.0213          |
| 2.0378        | 5.9130  | 17   | 1.9882          |
| 2.0073        | 6.9565  | 20   | 1.9451          |
| 1.9467        | 8.0     | 23   | 1.8881          |
| 1.9133        | 8.6957  | 25   | 1.8427          |
| 1.8018        | 9.7391  | 28   | 1.7652          |
| 1.691         | 10.7826 | 31   | 1.6856          |
| 1.6135        | 11.8261 | 34   | 1.6190          |
| 1.5471        | 12.8696 | 37   | 1.5723          |
| 1.5155        | 13.9130 | 40   | 1.5440          |
| 1.4371        | 14.9565 | 43   | 1.5235          |
| 1.4825        | 16.0    | 46   | 1.5019          |
| 1.4532        | 16.6957 | 48   | 1.4894          |
| 1.4277        | 17.7391 | 51   | 1.4693          |
| 1.366         | 18.7826 | 54   | 1.4504          |
| 1.4417        | 19.8261 | 57   | 1.4330          |
| 1.3645        | 20.8696 | 60   | 1.4170          |
| 1.3153        | 21.9130 | 63   | 1.4029          |
| 1.3036        | 22.9565 | 66   | 1.3847          |
| 1.2775        | 24.0    | 69   | 1.3692          |
| 1.2726        | 24.6957 | 71   | 1.3621          |
| 1.2949        | 25.7391 | 74   | 1.3510          |
| 1.1424        | 26.7826 | 77   | 1.3406          |
| 1.2489        | 27.8261 | 80   | 1.3327          |
| 1.1662        | 28.8696 | 83   | 1.3225          |
| 1.1614        | 29.9130 | 86   | 1.3144          |
| 1.146         | 30.9565 | 89   | 1.3094          |
| 1.1177        | 32.0    | 92   | 1.3025          |
| 1.0748        | 32.6957 | 94   | 1.2985          |
| 1.118         | 33.7391 | 97   | 1.2957          |
| 1.0599        | 34.7826 | 100  | 1.2924          |
| 1.0607        | 35.8261 | 103  | 1.2912          |
| 1.0041        | 36.8696 | 106  | 1.2955          |
| 1.0132        | 37.9130 | 109  | 1.2980          |
| 1.0062        | 38.9565 | 112  | 1.3068          |
| 0.9466        | 40.0    | 115  | 1.3118          |
| 0.9728        | 40.6957 | 117  | 1.3147          |
| 0.882         | 41.7391 | 120  | 1.3195          |
| 0.9193        | 42.7826 | 123  | 1.3276          |
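Reading the table: the validation loss falls steadily to its minimum of 1.2912 around epoch 35.8 (step 103), then drifts back up while the training loss keeps dropping, i.e. the later checkpoints overfit the small chat-60 training set. Training stops at epoch ~42.8, seven evaluations after the best one, which looks like early stopping on the evaluation loss. Since the actual hyperparameters are not listed in the card, the sketch below only illustrates a Hugging Face Trainer setup that produces this kind of per-epoch table; every value is a placeholder, not the configuration that was actually used.

```python
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="chat-60_ft_task-1",        # hypothetical
    num_train_epochs=50,                   # placeholder
    learning_rate=1e-4,                    # placeholder; the "(lr)" note hints the configs differ here
    per_device_train_batch_size=1,         # placeholder
    gradient_accumulation_steps=8,         # placeholder
    eval_strategy="epoch",                 # one "Validation Loss" row per epoch, as in the table
    logging_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,           # restore the epoch-35.8 checkpoint (lowest eval loss)
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Guess inferred from the table: training halted 7 evaluations after the best
# one, which matches an early-stopping callback like this passed to Trainer.
early_stopping = EarlyStoppingCallback(early_stopping_patience=7)
```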
Base model: meta-llama/Llama-3.1-8B