roberta-large-finetuned-abbr-unfiltered-plod
This model is a fine-tuned version of roberta-large, trained on the unfiltered PLODv2 dataset. It is released with our LREC-COLING 2024 publication, "Using character-level models for efficient abbreviation and long-form detection". It achieves the following results on the test set:
Results on abbreviations:
- Precision: 0.8916
- Recall: 0.9152
- F1: 0.9033
Results on long forms:
- Precision: 0.8607
- Recall: 0.9142
- F1: 0.8867
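As a quick consistency check, the F1 scores above can be recomputed from the reported precision and recall, assuming F1 here is the usual harmonic mean of the two. The helper name `f1_score` is illustrative. Because the inputs are already rounded to four decimals, the recomputed values may differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results above
reported = {
    "abbreviations": (0.8916, 0.9152, 0.9033),
    "long forms": (0.8607, 0.9142, 0.8867),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.4f}, reported F1 = {f1:.4f}")
```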
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
The model was fine-tuned on the PLODv2 unfiltered dataset; the results reported above are on the test set.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
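For illustration, a `linear` scheduler decays the learning rate from its initial value to zero over the course of training. The sketch below is plain Python and follows the same shape as `get_linear_schedule_with_warmup` in Transformers. It assumes no warmup, since the list above does not mention one, and takes the total number of optimizer steps as a parameter. The function name `linear_lr` and the step count of 170,000 in the example are placeholders, not values taken from the training run.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear schedule with optional warmup.

    base_lr defaults to the learning_rate listed above; warmup_steps=0 is an
    assumption, since no warmup setting is listed.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Example with a placeholder total step count (not from the training run)
total = 170_000
for step in (0, total // 2, total):
    print(step, linear_lr(step, total))
```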
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
0.167 | 0.25 | 7000 | 0.1616 | 0.9484 | 0.9366 | 0.9424 | 0.9376 |
0.1673 | 0.49 | 14000 | 0.1459 | 0.9504 | 0.9370 | 0.9437 | 0.9389 |
0.1472 | 0.74 | 21000 | 0.1560 | 0.9531 | 0.9373 | 0.9451 | 0.9398 |
0.1519 | 0.98 | 28000 | 0.1434 | 0.9551 | 0.9382 | 0.9466 | 0.9415 |
0.1388 | 1.23 | 35000 | 0.1472 | 0.9516 | 0.9374 | 0.9444 | 0.9400 |
0.1291 | 1.48 | 42000 | 0.1416 | 0.9557 | 0.9403 | 0.9479 | 0.9431 |
0.1298 | 1.72 | 49000 | 0.1394 | 0.9577 | 0.9459 | 0.9517 | 0.9470 |
0.1269 | 1.97 | 56000 | 0.1401 | 0.9587 | 0.9446 | 0.9516 | 0.9468 |
0.1128 | 2.21 | 63000 | 0.1410 | 0.9568 | 0.9497 | 0.9533 | 0.9486 |
0.1154 | 2.46 | 70000 | 0.1366 | 0.9583 | 0.9495 | 0.9539 | 0.9493 |
0.1138 | 2.71 | 77000 | 0.1413 | 0.9600 | 0.9502 | 0.9551 | 0.9506 |
0.1117 | 2.95 | 84000 | 0.1313 | 0.9605 | 0.9501 | 0.9552 | 0.9508 |
0.0997 | 3.2 | 91000 | 0.1503 | 0.9577 | 0.9527 | 0.9552 | 0.9507 |
0.1008 | 3.44 | 98000 | 0.1360 | 0.9587 | 0.9536 | 0.9561 | 0.9515 |
0.0909 | 3.69 | 105000 | 0.1435 | 0.9619 | 0.9520 | 0.9569 | 0.9525 |
0.0903 | 3.93 | 112000 | 0.1482 | 0.9619 | 0.9522 | 0.9570 | 0.9528 |
0.075 | 4.18 | 119000 | 0.1603 | 0.9616 | 0.9546 | 0.9581 | 0.9537 |
0.0804 | 4.43 | 126000 | 0.1512 | 0.9600 | 0.9560 | 0.9580 | 0.9536 |
0.0811 | 4.67 | 133000 | 0.1435 | 0.9628 | 0.9543 | 0.9585 | 0.9540 |
0.0778 | 4.92 | 140000 | 0.1384 | 0.9616 | 0.9566 | 0.9591 | 0.9548 |
0.065 | 5.16 | 147000 | 0.1640 | 0.9622 | 0.9567 | 0.9595 | 0.9550 |
0.0607 | 5.41 | 154000 | 0.1755 | 0.9632 | 0.9562 | 0.9597 | 0.9554 |
0.0587 | 5.66 | 161000 | 0.1643 | 0.9622 | 0.9575 | 0.9599 | 0.9555 |
0.062 | 5.9 | 168000 | 0.1663 | 0.9628 | 0.9569 | 0.9598 | 0.9556 |
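One way to read the table is to compare the checkpoint with the lowest validation loss against the one with the highest F1. The sketch below does that using the step, validation loss, and F1 columns copied from the table. The variable names are illustrative, and nothing here indicates which checkpoint was actually released.

```python
# (step, validation loss, F1) copied from the training results table
rows = [
    (7000, 0.1616, 0.9424), (14000, 0.1459, 0.9437), (21000, 0.1560, 0.9451),
    (28000, 0.1434, 0.9466), (35000, 0.1472, 0.9444), (42000, 0.1416, 0.9479),
    (49000, 0.1394, 0.9517), (56000, 0.1401, 0.9516), (63000, 0.1410, 0.9533),
    (70000, 0.1366, 0.9539), (77000, 0.1413, 0.9551), (84000, 0.1313, 0.9552),
    (91000, 0.1503, 0.9552), (98000, 0.1360, 0.9561), (105000, 0.1435, 0.9569),
    (112000, 0.1482, 0.9570), (119000, 0.1603, 0.9581), (126000, 0.1512, 0.9580),
    (133000, 0.1435, 0.9585), (140000, 0.1384, 0.9591), (147000, 0.1640, 0.9595),
    (154000, 0.1755, 0.9597), (161000, 0.1643, 0.9599), (168000, 0.1663, 0.9598),
]

# Checkpoint with the lowest validation loss
best_by_loss = min(rows, key=lambda row: row[1])
# Checkpoint with the highest F1
best_by_f1 = max(rows, key=lambda row: row[2])

print(f"lowest validation loss: step {best_by_loss[0]} (loss {best_by_loss[1]:.4f})")
print(f"highest F1: step {best_by_f1[0]} (F1 {best_by_f1[2]:.4f})")
```

On these numbers the lowest validation loss is at step 84000 (0.1313) and the highest F1 is at step 161000 (0.9599), so the two criteria select different checkpoints.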
Framework versions
- Transformers 4.16.2
- PyTorch 1.11.0
- Datasets 2.1.0
- Tokenizers 0.10.3