# GS_bert3
This model is a fine-tuned version of biblo0507/GS_bert2 on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):
- Loss: 0.1019
- F1: 0.5384
- Precision: 0.5630
- Recall: 0.5199
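
The card does not state the task, but the F1/precision/recall metrics suggest a classification head. Below is a minimal usage sketch; the repo id `biblo0507/GS_bert3` and the sequence-classification head are both assumptions inferred from the card, not confirmed by it.

```python
# Minimal usage sketch. Assumptions: the checkpoint is published as
# "biblo0507/GS_bert3" (inferred from the base model's namespace) and
# carries a sequence-classification head; adjust both if they differ.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "biblo0507/GS_bert3"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Example input text", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```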
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 100
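
These settings map directly onto `transformers.TrainingArguments`. A reproduction sketch under a standard `Trainer` setup follows; the output directory and evaluation strategy are assumptions (the results table suggests per-epoch evaluation), and the dataset and model head are not documented on this card.

```python
# Sketch of the hyperparameters above as transformers.TrainingArguments.
# output_dir and eval_strategy are assumptions; everything else mirrors
# the documented settings.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GS_bert3",   # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=100,
    eval_strategy="epoch",   # assumed from the per-epoch results table
)
```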
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 | Precision | Recall |
|---|---|---|---|---|---|---|
0.6824 | 1.0 | 45 | 0.6456 | 0.0468 | 0.05 | 0.0444 |
0.5373 | 2.0 | 90 | 0.5130 | 0.0423 | 0.0444 | 0.0407 |
0.4462 | 3.0 | 135 | 0.4115 | 0.0270 | 0.0278 | 0.0264 |
0.3407 | 4.0 | 180 | 0.3163 | 0.0304 | 0.0315 | 0.0296 |
0.2699 | 5.0 | 225 | 0.2430 | 0.0352 | 0.0370 | 0.0338 |
0.2133 | 6.0 | 270 | 0.1989 | 0.0349 | 0.0370 | 0.0333 |
0.1896 | 7.0 | 315 | 0.1770 | 0.0450 | 0.0481 | 0.0426 |
0.1754 | 8.0 | 360 | 0.1672 | 0.0450 | 0.0481 | 0.0426 |
0.1708 | 9.0 | 405 | 0.1630 | 0.0439 | 0.0463 | 0.0421 |
0.1675 | 10.0 | 450 | 0.1609 | 0.0709 | 0.0759 | 0.0671 |
0.165 | 11.0 | 495 | 0.1587 | 0.1754 | 0.1852 | 0.1681 |
0.1613 | 12.0 | 540 | 0.1564 | 0.2225 | 0.2333 | 0.2144 |
0.1587 | 13.0 | 585 | 0.1538 | 0.2921 | 0.3056 | 0.2819 |
0.1549 | 14.0 | 630 | 0.1512 | 0.3347 | 0.35 | 0.3231 |
0.1519 | 15.0 | 675 | 0.1485 | 0.3479 | 0.3648 | 0.3352 |
0.1491 | 16.0 | 720 | 0.1458 | 0.3534 | 0.3722 | 0.3394 |
0.1442 | 17.0 | 765 | 0.1435 | 0.3751 | 0.3944 | 0.3606 |
0.1418 | 18.0 | 810 | 0.1408 | 0.3873 | 0.4074 | 0.3722 |
0.1377 | 19.0 | 855 | 0.1386 | 0.3950 | 0.4148 | 0.3801 |
0.1336 | 20.0 | 900 | 0.1365 | 0.4000 | 0.4204 | 0.3847 |
0.133 | 21.0 | 945 | 0.1344 | 0.4048 | 0.4259 | 0.3889 |
0.1294 | 22.0 | 990 | 0.1322 | 0.4188 | 0.4407 | 0.4023 |
0.1263 | 23.0 | 1035 | 0.1301 | 0.4138 | 0.4352 | 0.3977 |
0.1255 | 24.0 | 1080 | 0.1293 | 0.4138 | 0.4352 | 0.3977 |
0.1199 | 25.0 | 1125 | 0.1269 | 0.4228 | 0.4444 | 0.4065 |
0.1183 | 26.0 | 1170 | 0.1255 | 0.4294 | 0.4519 | 0.4125 |
0.1129 | 27.0 | 1215 | 0.1239 | 0.4497 | 0.4741 | 0.4315 |
0.1117 | 28.0 | 1260 | 0.1227 | 0.4540 | 0.4778 | 0.4361 |
0.1109 | 29.0 | 1305 | 0.1214 | 0.4561 | 0.4796 | 0.4384 |
0.1067 | 30.0 | 1350 | 0.1199 | 0.4593 | 0.4833 | 0.4412 |
0.1053 | 31.0 | 1395 | 0.1190 | 0.4569 | 0.4796 | 0.4398 |
0.1018 | 32.0 | 1440 | 0.1181 | 0.4722 | 0.4963 | 0.4542 |
0.0995 | 33.0 | 1485 | 0.1168 | 0.4780 | 0.5019 | 0.4602 |
0.0986 | 34.0 | 1530 | 0.1157 | 0.4794 | 0.5037 | 0.4611 |
0.0944 | 35.0 | 1575 | 0.1153 | 0.4765 | 0.5 | 0.4588 |
0.0913 | 36.0 | 1620 | 0.1139 | 0.4958 | 0.5204 | 0.4773 |
0.0906 | 37.0 | 1665 | 0.1131 | 0.4886 | 0.5130 | 0.4704 |
0.0904 | 38.0 | 1710 | 0.1120 | 0.4995 | 0.5241 | 0.4810 |
0.0869 | 39.0 | 1755 | 0.1118 | 0.5048 | 0.5296 | 0.4861 |
0.0838 | 40.0 | 1800 | 0.1113 | 0.5119 | 0.5370 | 0.4931 |
0.0827 | 41.0 | 1845 | 0.1100 | 0.5082 | 0.5333 | 0.4894 |
0.0819 | 42.0 | 1890 | 0.1097 | 0.5034 | 0.5278 | 0.4852 |
0.0792 | 43.0 | 1935 | 0.1094 | 0.5212 | 0.5463 | 0.5023 |
0.0778 | 44.0 | 1980 | 0.1086 | 0.5138 | 0.5389 | 0.4949 |
0.0753 | 45.0 | 2025 | 0.1075 | 0.5272 | 0.5537 | 0.5074 |
0.075 | 46.0 | 2070 | 0.1082 | 0.5217 | 0.5463 | 0.5032 |
0.0728 | 47.0 | 2115 | 0.1074 | 0.5180 | 0.5426 | 0.4995 |
0.0698 | 48.0 | 2160 | 0.1066 | 0.5249 | 0.55 | 0.5060 |
0.0695 | 49.0 | 2205 | 0.1065 | 0.5251 | 0.55 | 0.5065 |
0.0684 | 50.0 | 2250 | 0.1057 | 0.5196 | 0.5444 | 0.5009 |
0.0686 | 51.0 | 2295 | 0.1057 | 0.5183 | 0.5426 | 0.5 |
0.065 | 52.0 | 2340 | 0.1058 | 0.5164 | 0.5407 | 0.4981 |
0.0634 | 53.0 | 2385 | 0.1046 | 0.5254 | 0.55 | 0.5069 |
0.0621 | 54.0 | 2430 | 0.1051 | 0.5339 | 0.5593 | 0.5148 |
0.061 | 55.0 | 2475 | 0.1043 | 0.5304 | 0.5556 | 0.5116 |
0.0611 | 56.0 | 2520 | 0.1043 | 0.5304 | 0.5556 | 0.5116 |
0.0587 | 57.0 | 2565 | 0.1043 | 0.5286 | 0.5537 | 0.5097 |
0.0576 | 58.0 | 2610 | 0.1037 | 0.5251 | 0.55 | 0.5065 |
0.0577 | 59.0 | 2655 | 0.1034 | 0.5230 | 0.5481 | 0.5042 |
0.0551 | 60.0 | 2700 | 0.1029 | 0.5291 | 0.5537 | 0.5106 |
0.054 | 61.0 | 2745 | 0.1034 | 0.5267 | 0.5519 | 0.5079 |
0.0521 | 62.0 | 2790 | 0.1036 | 0.5267 | 0.5519 | 0.5079 |
0.0521 | 63.0 | 2835 | 0.1029 | 0.5251 | 0.55 | 0.5065 |
0.0515 | 64.0 | 2880 | 0.1024 | 0.5272 | 0.5519 | 0.5088 |
0.0501 | 65.0 | 2925 | 0.1033 | 0.5235 | 0.5481 | 0.5051 |
0.0493 | 66.0 | 2970 | 0.1031 | 0.5222 | 0.5463 | 0.5042 |
0.0489 | 67.0 | 3015 | 0.1024 | 0.5294 | 0.5537 | 0.5111 |
0.0491 | 68.0 | 3060 | 0.1022 | 0.5468 | 0.5722 | 0.5278 |
0.0479 | 69.0 | 3105 | 0.1019 | 0.5339 | 0.5593 | 0.5148 |
0.0456 | 70.0 | 3150 | 0.1028 | 0.5310 | 0.5556 | 0.5125 |
0.0453 | 71.0 | 3195 | 0.1025 | 0.5331 | 0.5574 | 0.5148 |
0.0459 | 72.0 | 3240 | 0.1021 | 0.5349 | 0.5593 | 0.5167 |
0.0451 | 73.0 | 3285 | 0.1023 | 0.5310 | 0.5556 | 0.5125 |
0.0447 | 74.0 | 3330 | 0.1019 | 0.5447 | 0.5704 | 0.5255 |
0.0446 | 75.0 | 3375 | 0.1020 | 0.5362 | 0.5611 | 0.5176 |
0.0434 | 76.0 | 3420 | 0.1020 | 0.5365 | 0.5611 | 0.5181 |
0.0429 | 77.0 | 3465 | 0.1021 | 0.5418 | 0.5667 | 0.5231 |
0.0427 | 78.0 | 3510 | 0.1017 | 0.5362 | 0.5611 | 0.5176 |
0.042 | 79.0 | 3555 | 0.1021 | 0.5280 | 0.5519 | 0.5102 |
0.0424 | 80.0 | 3600 | 0.1020 | 0.5437 | 0.5685 | 0.525 |
0.0407 | 81.0 | 3645 | 0.1023 | 0.5241 | 0.5481 | 0.5060 |
0.041 | 82.0 | 3690 | 0.1017 | 0.5294 | 0.5537 | 0.5111 |
0.0395 | 83.0 | 3735 | 0.1018 | 0.5312 | 0.5556 | 0.5130 |
0.0406 | 84.0 | 3780 | 0.1016 | 0.5352 | 0.5593 | 0.5171 |
0.0388 | 85.0 | 3825 | 0.1017 | 0.5368 | 0.5611 | 0.5185 |
0.0391 | 86.0 | 3870 | 0.1021 | 0.5296 | 0.5537 | 0.5116 |
0.0378 | 87.0 | 3915 | 0.1020 | 0.5347 | 0.5593 | 0.5162 |
0.0377 | 88.0 | 3960 | 0.1022 | 0.5328 | 0.5574 | 0.5144 |
0.0387 | 89.0 | 4005 | 0.1025 | 0.5328 | 0.5574 | 0.5144 |
0.0375 | 90.0 | 4050 | 0.1018 | 0.5402 | 0.5648 | 0.5218 |
0.0372 | 91.0 | 4095 | 0.1020 | 0.5328 | 0.5574 | 0.5144 |
0.0381 | 92.0 | 4140 | 0.1020 | 0.5333 | 0.5574 | 0.5153 |
0.0374 | 93.0 | 4185 | 0.1020 | 0.5365 | 0.5611 | 0.5181 |
0.0363 | 94.0 | 4230 | 0.1018 | 0.5365 | 0.5611 | 0.5181 |
0.0372 | 95.0 | 4275 | 0.1020 | 0.5384 | 0.5630 | 0.5199 |
0.0373 | 96.0 | 4320 | 0.1018 | 0.5386 | 0.5630 | 0.5204 |
0.0366 | 97.0 | 4365 | 0.1019 | 0.5368 | 0.5611 | 0.5185 |
0.0365 | 98.0 | 4410 | 0.1020 | 0.5402 | 0.5648 | 0.5218 |
0.0369 | 99.0 | 4455 | 0.1019 | 0.5386 | 0.5630 | 0.5204 |
0.0356 | 100.0 | 4500 | 0.1019 | 0.5384 | 0.5630 | 0.5199 |
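
The card does not say how the F1, precision, and recall columns were averaged. A hedged sketch of a `compute_metrics` function that would produce metrics of this shape, assuming macro averaging over single-label predictions (both assumptions):

```python
# Sketch of a compute_metrics function for the table above. The "macro"
# averaging and single-label argmax decoding are assumptions; the card
# does not document how the metrics were aggregated.
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {"f1": f1, "precision": precision, "recall": recall}
```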
### Framework versions
- Transformers 4.51.0.dev0
- PyTorch 2.5.1+cu121
- Datasets 3.4.1
- Tokenizers 0.21.0