# shroom-semeval25/cogumelo-hallucinations-detector-flan-t5-xl-qa
This model is a fine-tuned version of google/flan-t5-xl; the fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set:
- Loss: 0.1546
- Precision: 0.5728
- Recall: 0.5138
- F1: 0.5417
- Accuracy: 0.9463
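As a sanity check, the reported F1 is the harmonic mean of the reported precision and recall; a quick illustration in plain Python:

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# The reported precision and recall reproduce the reported F1 after rounding:
print(round(f1_score(0.5728, 0.5138), 4))  # → 0.5417
```

The high accuracy (0.9463) alongside a much lower F1 (0.5417) suggests a class-imbalanced task, where F1 on the positive (hallucination) class is the more informative number.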
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 132
- eval_batch_size: 132
- seed: 0
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
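The `linear` scheduler decays the learning rate from its initial value toward zero across training. A minimal sketch of that decay (assuming no warmup, which the configuration above does not list; the total step count is hypothetical, for illustration only):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3e-05) -> float:
    """Linearly decay base_lr to 0 over total_steps (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

# Hypothetical total step count, for illustration only:
total = 11800
print(linear_lr(0, total))      # → 3e-05 (full learning rate at the start)
print(linear_lr(total, total))  # → 0.0   (decayed to zero at the end)
```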
### Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
0.4449 | 0.0177 | 100 | 0.2853 | 0.1221 | 0.1556 | 0.1368 | 0.9004 |
0.3201 | 0.0353 | 200 | 0.2333 | 0.2243 | 0.2205 | 0.2224 | 0.9187 |
0.306 | 0.0530 | 300 | 0.2205 | 0.2454 | 0.1829 | 0.2096 | 0.9206 |
0.2721 | 0.0706 | 400 | 0.2092 | 0.2796 | 0.2906 | 0.2850 | 0.9258 |
0.2742 | 0.0883 | 500 | 0.2045 | 0.2941 | 0.2137 | 0.2475 | 0.9220 |
0.2579 | 0.1059 | 600 | 0.1946 | 0.3327 | 0.3128 | 0.3225 | 0.9297 |
0.2783 | 0.1236 | 700 | 0.1949 | 0.3191 | 0.3197 | 0.3194 | 0.9295 |
0.2693 | 0.1412 | 800 | 0.1902 | 0.3365 | 0.2991 | 0.3167 | 0.9323 |
0.2494 | 0.1589 | 900 | 0.1894 | 0.3368 | 0.3333 | 0.3351 | 0.9331 |
0.2529 | 0.1765 | 1000 | 0.1837 | 0.3529 | 0.2564 | 0.2970 | 0.9322 |
0.2494 | 0.1942 | 1100 | 0.1804 | 0.3633 | 0.3179 | 0.3391 | 0.9348 |
0.2451 | 0.2118 | 1200 | 0.1916 | 0.3453 | 0.3795 | 0.3616 | 0.9328 |
0.229 | 0.2295 | 1300 | 0.1788 | 0.3643 | 0.3556 | 0.3599 | 0.9336 |
0.2382 | 0.2471 | 1400 | 0.1727 | 0.3648 | 0.3299 | 0.3465 | 0.9368 |
0.236 | 0.2648 | 1500 | 0.1713 | 0.3683 | 0.3538 | 0.3609 | 0.9390 |
0.2394 | 0.2824 | 1600 | 0.1669 | 0.3655 | 0.3111 | 0.3361 | 0.9382 |
0.2405 | 0.3001 | 1700 | 0.1737 | 0.3790 | 0.3641 | 0.3714 | 0.9366 |
0.2297 | 0.3177 | 1800 | 0.1665 | 0.3734 | 0.3504 | 0.3616 | 0.9397 |
0.2231 | 0.3354 | 1900 | 0.1653 | 0.3748 | 0.3350 | 0.3538 | 0.9387 |
0.2095 | 0.3530 | 2000 | 0.1616 | 0.3958 | 0.3538 | 0.3736 | 0.9399 |
0.2329 | 0.3707 | 2100 | 0.1674 | 0.3611 | 0.3709 | 0.3659 | 0.9382 |
0.2048 | 0.3883 | 2200 | 0.1593 | 0.4614 | 0.4085 | 0.4334 | 0.9397 |
0.2236 | 0.4060 | 2300 | 0.1666 | 0.4704 | 0.4342 | 0.4516 | 0.9411 |
0.1982 | 0.4237 | 2400 | 0.1482 | 0.4884 | 0.3949 | 0.4367 | 0.9425 |
0.2149 | 0.4413 | 2500 | 0.1511 | 0.5207 | 0.4291 | 0.4705 | 0.9437 |
0.1926 | 0.4590 | 2600 | 0.1458 | 0.5350 | 0.4051 | 0.4611 | 0.9445 |
0.1997 | 0.4766 | 2700 | 0.1456 | 0.5274 | 0.4274 | 0.4721 | 0.9439 |
0.2003 | 0.4943 | 2800 | 0.1438 | 0.5252 | 0.3915 | 0.4486 | 0.9453 |
0.203 | 0.5119 | 2900 | 0.1568 | 0.4942 | 0.4393 | 0.4652 | 0.9408 |
0.1932 | 0.5296 | 3000 | 0.1518 | 0.5375 | 0.4530 | 0.4917 | 0.9429 |
0.1996 | 0.5472 | 3100 | 0.1491 | 0.5266 | 0.4393 | 0.4790 | 0.9432 |
0.1999 | 0.5649 | 3200 | 0.1444 | 0.5226 | 0.4154 | 0.4629 | 0.9445 |
0.2073 | 0.5825 | 3300 | 0.1444 | 0.5163 | 0.4342 | 0.4717 | 0.9436 |
0.2168 | 0.6002 | 3400 | 0.1415 | 0.5408 | 0.3966 | 0.4576 | 0.9452 |
0.1899 | 0.6178 | 3500 | 0.1394 | 0.5448 | 0.4154 | 0.4714 | 0.9479 |
0.2064 | 0.6355 | 3600 | 0.1415 | 0.5405 | 0.3880 | 0.4517 | 0.9456 |
0.1839 | 0.6531 | 3700 | 0.1402 | 0.5254 | 0.4427 | 0.4805 | 0.9453 |
0.1833 | 0.6708 | 3800 | 0.1451 | 0.5203 | 0.4821 | 0.5004 | 0.9431 |
0.1927 | 0.6884 | 3900 | 0.1408 | 0.5689 | 0.4376 | 0.4947 | 0.9464 |
0.1811 | 0.7061 | 4000 | 0.1407 | 0.5649 | 0.4462 | 0.4986 | 0.9452 |
0.1717 | 0.7237 | 4100 | 0.1439 | 0.5245 | 0.4581 | 0.4891 | 0.9438 |
0.1788 | 0.7414 | 4200 | 0.1423 | 0.5284 | 0.4769 | 0.5013 | 0.9451 |
0.1921 | 0.7590 | 4300 | 0.1396 | 0.5264 | 0.4427 | 0.4810 | 0.9469 |
0.1969 | 0.7767 | 4400 | 0.1388 | 0.5433 | 0.4615 | 0.4991 | 0.9477 |
0.1739 | 0.7944 | 4500 | 0.1372 | 0.5991 | 0.4444 | 0.5103 | 0.9482 |
0.19 | 0.8120 | 4600 | 0.1377 | 0.5606 | 0.4188 | 0.4795 | 0.9468 |
0.1922 | 0.8297 | 4700 | 0.1351 | 0.5508 | 0.4632 | 0.5032 | 0.9467 |
0.1736 | 0.8473 | 4800 | 0.1415 | 0.5505 | 0.4752 | 0.5101 | 0.9471 |
0.1768 | 0.8650 | 4900 | 0.1421 | 0.5207 | 0.4735 | 0.4960 | 0.9467 |
0.1959 | 0.8826 | 5000 | 0.1347 | 0.5445 | 0.4496 | 0.4925 | 0.9465 |
0.1628 | 0.9003 | 5100 | 0.1371 | 0.5742 | 0.4632 | 0.5128 | 0.9465 |
0.191 | 0.9179 | 5200 | 0.1350 | 0.5396 | 0.4547 | 0.4935 | 0.9468 |
0.1752 | 0.9356 | 5300 | 0.1363 | 0.5879 | 0.4632 | 0.5182 | 0.9469 |
0.1898 | 0.9532 | 5400 | 0.1342 | 0.5630 | 0.4581 | 0.5052 | 0.9469 |
0.1984 | 0.9709 | 5500 | 0.1376 | 0.5571 | 0.4923 | 0.5227 | 0.9468 |
0.1897 | 0.9885 | 5600 | 0.1337 | 0.5756 | 0.4684 | 0.5165 | 0.9476 |
0.1668 | 1.0062 | 5700 | 0.1376 | 0.5449 | 0.4974 | 0.5201 | 0.9473 |
0.1672 | 1.0238 | 5800 | 0.1392 | 0.5526 | 0.4940 | 0.5217 | 0.9467 |
0.1386 | 1.0415 | 5900 | 0.1382 | 0.5437 | 0.4786 | 0.5091 | 0.9468 |
0.165 | 1.0591 | 6000 | 0.1350 | 0.5677 | 0.4872 | 0.5244 | 0.9477 |
0.1568 | 1.0768 | 6100 | 0.1397 | 0.5809 | 0.4786 | 0.5248 | 0.9485 |
0.1585 | 1.0944 | 6200 | 0.1330 | 0.5562 | 0.4906 | 0.5213 | 0.9497 |
0.1572 | 1.1121 | 6300 | 0.1381 | 0.5794 | 0.4991 | 0.5363 | 0.9501 |
0.1717 | 1.1297 | 6400 | 0.1310 | 0.5873 | 0.4889 | 0.5336 | 0.9534 |
0.1665 | 1.1474 | 6500 | 0.1286 | 0.5956 | 0.4632 | 0.5212 | 0.9512 |
0.1844 | 1.1650 | 6600 | 0.1356 | 0.5486 | 0.4923 | 0.5189 | 0.9507 |
0.1567 | 1.1827 | 6700 | 0.1354 | 0.5739 | 0.5043 | 0.5369 | 0.9508 |
0.1665 | 1.2004 | 6800 | 0.1357 | 0.5764 | 0.5094 | 0.5408 | 0.9517 |
0.1571 | 1.2180 | 6900 | 0.1367 | 0.5723 | 0.5009 | 0.5342 | 0.9485 |
0.1558 | 1.2357 | 7000 | 0.1330 | 0.5864 | 0.5162 | 0.5491 | 0.9521 |
0.1393 | 1.2533 | 7100 | 0.1324 | 0.5922 | 0.4940 | 0.5387 | 0.9510 |
0.1661 | 1.2710 | 7200 | 0.1317 | 0.5802 | 0.5009 | 0.5376 | 0.9513 |
0.1567 | 1.2886 | 7300 | 0.1331 | 0.5904 | 0.5248 | 0.5557 | 0.9508 |
0.1506 | 1.3063 | 7400 | 0.1347 | 0.5706 | 0.5179 | 0.5430 | 0.9495 |
0.1704 | 1.3239 | 7500 | 0.1283 | 0.5907 | 0.4786 | 0.5288 | 0.9522 |
0.1685 | 1.3416 | 7600 | 0.1268 | 0.5854 | 0.4923 | 0.5348 | 0.9526 |
0.152 | 1.3592 | 7700 | 0.1249 | 0.6059 | 0.4889 | 0.5412 | 0.9537 |
0.1438 | 1.3769 | 7800 | 0.1238 | 0.5900 | 0.4821 | 0.5306 | 0.9546 |
0.1738 | 1.3945 | 7900 | 0.1278 | 0.5873 | 0.5231 | 0.5533 | 0.9537 |
0.1637 | 1.4122 | 8000 | 0.1271 | 0.6071 | 0.5231 | 0.5620 | 0.9535 |
0.1746 | 1.4298 | 8100 | 0.1278 | 0.6212 | 0.5214 | 0.5669 | 0.9548 |
0.1631 | 1.4475 | 8200 | 0.1275 | 0.6044 | 0.5197 | 0.5588 | 0.9538 |
0.1571 | 1.4651 | 8300 | 0.1274 | 0.6152 | 0.4838 | 0.5416 | 0.9522 |
0.1567 | 1.4828 | 8400 | 0.1271 | 0.5938 | 0.4923 | 0.5383 | 0.9529 |
0.1563 | 1.5004 | 8500 | 0.1301 | 0.5903 | 0.5197 | 0.5527 | 0.9518 |
0.1681 | 1.5181 | 8600 | 0.1302 | 0.5765 | 0.5282 | 0.5513 | 0.9522 |
0.156 | 1.5357 | 8700 | 0.1282 | 0.5996 | 0.5197 | 0.5568 | 0.9528 |
0.1562 | 1.5534 | 8800 | 0.1268 | 0.6143 | 0.5145 | 0.5600 | 0.9531 |
0.1653 | 1.5711 | 8900 | 0.1320 | 0.6140 | 0.5111 | 0.5578 | 0.9536 |
0.1649 | 1.5887 | 9000 | 0.1267 | 0.6004 | 0.5265 | 0.5610 | 0.9535 |
0.1618 | 1.6064 | 9100 | 0.1299 | 0.6047 | 0.5282 | 0.5639 | 0.9554 |
0.1629 | 1.6240 | 9200 | 0.1251 | 0.6250 | 0.5128 | 0.5634 | 0.9566 |
0.1559 | 1.6417 | 9300 | 0.1264 | 0.6082 | 0.5094 | 0.5544 | 0.9549 |
0.1571 | 1.6593 | 9400 | 0.1301 | 0.6104 | 0.5197 | 0.5614 | 0.9529 |
0.154 | 1.6770 | 9500 | 0.1294 | 0.6342 | 0.5128 | 0.5671 | 0.9547 |
0.1764 | 1.6946 | 9600 | 0.1291 | 0.6190 | 0.5333 | 0.5730 | 0.9561 |
0.1511 | 1.7123 | 9700 | 0.1277 | 0.6184 | 0.5043 | 0.5556 | 0.9549 |
0.1594 | 1.7299 | 9800 | 0.1275 | 0.6264 | 0.4872 | 0.5481 | 0.9541 |
0.1531 | 1.7476 | 9900 | 0.1280 | 0.6069 | 0.5145 | 0.5569 | 0.9538 |
0.1572 | 1.7652 | 10000 | 0.1264 | 0.6112 | 0.5026 | 0.5516 | 0.9540 |
0.1588 | 1.7829 | 10100 | 0.1239 | 0.5664 | 0.5179 | 0.5411 | 0.9538 |
0.1456 | 1.8005 | 10200 | 0.1232 | 0.6143 | 0.5282 | 0.5680 | 0.9559 |
0.1604 | 1.8182 | 10300 | 0.1255 | 0.6130 | 0.5470 | 0.5781 | 0.9554 |
0.1379 | 1.8358 | 10400 | 0.1236 | 0.6085 | 0.5128 | 0.5566 | 0.9562 |
0.1448 | 1.8535 | 10500 | 0.1252 | 0.6176 | 0.5162 | 0.5624 | 0.9550 |
0.152 | 1.8711 | 10600 | 0.1235 | 0.6109 | 0.4991 | 0.5494 | 0.9541 |
0.1478 | 1.8888 | 10700 | 0.1217 | 0.6125 | 0.5350 | 0.5712 | 0.9553 |
0.1522 | 1.9064 | 10800 | 0.1224 | 0.6132 | 0.5094 | 0.5565 | 0.9547 |
0.1458 | 1.9241 | 10900 | 0.1239 | 0.6309 | 0.5231 | 0.5720 | 0.9534 |
0.155 | 1.9417 | 11000 | 0.1263 | 0.6265 | 0.5248 | 0.5712 | 0.9537 |
0.1525 | 1.9594 | 11100 | 0.1238 | 0.6180 | 0.5282 | 0.5696 | 0.9555 |
0.1499 | 1.9771 | 11200 | 0.1220 | 0.6283 | 0.5316 | 0.5759 | 0.9549 |
0.138 | 1.9947 | 11300 | 0.1240 | 0.6068 | 0.5197 | 0.5599 | 0.9531 |
0.1395 | 2.0124 | 11400 | 0.1298 | 0.5870 | 0.5248 | 0.5542 | 0.9528 |
0.1423 | 2.0300 | 11500 | 0.1261 | 0.5843 | 0.5214 | 0.5510 | 0.9532 |
0.1488 | 2.0477 | 11600 | 0.1282 | 0.5930 | 0.5179 | 0.5529 | 0.9546 |
0.1356 | 2.0653 | 11700 | 0.1284 | 0.5899 | 0.4991 | 0.5407 | 0.9532 |
0.1234 | 2.0830 | 11800 | 0.1258 | 0.6284 | 0.5145 | 0.5658 | 0.9544 |
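Training stopped after roughly 2.08 of the 10 configured epochs, and the headline metrics at the top of the card do not coincide with the best validation row, so checkpoint selection matters. A sketch of picking the best step by validation F1, over a small hand-copied excerpt of the log above:

```python
# Excerpt of the validation log above: (step, val_loss, f1)
log = [
    (10200, 0.1232, 0.5680),
    (10300, 0.1255, 0.5781),
    (10400, 0.1236, 0.5566),
    (11800, 0.1258, 0.5658),
]

# Select the checkpoint with the highest validation F1:
best_by_f1 = max(log, key=lambda row: row[2])
print(best_by_f1[0])  # → 10300
```

Selecting by validation loss instead would favor a different checkpoint (step 10200 in this excerpt), which is one reason the table reports both.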
### Framework versions
- Transformers 4.44.2
- Pytorch 2.3.1+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
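To approximate this environment, one might pin the listed versions (a sketch; the exact CUDA-enabled PyTorch wheel depends on your platform and index URL):

```shell
pip install "transformers==4.44.2" "datasets==2.21.0" "tokenizers==0.19.1" "torch==2.3.1"
```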