|
---

license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: simple
      split: train[80%:100%]
      args: simple
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8420290379811

---
|
|
|
|
|
|
# longformer-simple |
|
|
|
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset. |
|
It achieves the following results on the evaluation set (`train[80%:100%]`):

- Loss: 0.4966

- Accuracy: 0.8420

| Label | Precision | Recall | F1-score | Support |
|:------------|:---------:|:------:|:--------:|:-------:|
| Claim | 0.5959 | 0.6226 | 0.6089 | 4168 |
| Majorclaim | 0.7667 | 0.8123 | 0.7888 | 2152 |
| O | 0.9350 | 0.8943 | 0.9142 | 9226 |
| Premise | 0.8769 | 0.8831 | 0.8800 | 12073 |
| Macro avg | 0.7936 | 0.8031 | 0.7980 | 27619 |
| Weighted avg | 0.8453 | 0.8420 | 0.8434 | 27619 |
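
For single-label token classification, overall accuracy equals the support-weighted average of per-class recall, so the per-class numbers can be sanity-checked against the reported accuracy. A quick check using the full-precision recall and support values from the evaluation results above:

```python
# Per-class (recall, support) pairs from the evaluation results above.
results = {
    "Claim":      (0.6226007677543186, 4168),
    "Majorclaim": (0.8122676579925651, 2152),
    "O":          (0.8943203988727509, 9226),
    "Premise":    (0.8831276401888511, 12073),
}

total = sum(support for _, support in results.values())
weighted_recall = sum(r * s for r, s in results.values()) / total

print(f"{weighted_recall:.4f}")  # → 0.8420, matching the reported accuracy
```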
|
|
|
## Model description |
|
|
|
`longformer-simple` is [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) with a token-classification head, fine-tuned to tag every token of an essay as `Claim`, `Majorclaim`, `Premise`, or `O` (outside any argument component), using the `simple` configuration of the essays_su_g argument-mining dataset. The Longformer backbone's sparse attention lets whole essays of up to 4,096 tokens be processed in a single forward pass, so long argument components can be labeled without sliding windows.
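
Since the model emits one label per token, downstream use typically collapses runs of identical labels into argument spans. A minimal sketch (the label names come from the evaluation results above; the helper function and the example sequence are illustrative, not part of the model's API):

```python
# Sketch: collapse per-token predictions into contiguous argument spans.
# Labels are the four classes this model predicts; `end` is exclusive.

def labels_to_spans(labels):
    """Group consecutive identical labels into (label, start, end) spans,
    skipping tokens tagged 'O'."""
    spans, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != "O":
                spans.append((labels[start], start, i))
            start = i
    return spans

labels = ["O", "Majorclaim", "Majorclaim", "O", "Premise", "Premise", "Claim"]
print(labels_to_spans(labels))
# → [('Majorclaim', 1, 3), ('Premise', 4, 6), ('Claim', 6, 7)]
```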
|
|
|
## Intended uses & limitations |
|
|
|
The model is intended for argument mining on essay-like text: identifying which tokens belong to major claims, claims, and premises. Known limitations: it was trained on a single corpus and configuration (`essays_su_g`, `simple`), so transfer to other domains, genres, or languages is untested; `Claim` is by far the weakest class (F1 ≈ 0.61, versus ≈ 0.88 for `Premise`); and the evaluation split is a held-out slice of the same corpus, so the reported accuracy may overstate real-world performance.
|
|
|
## Training and evaluation data |
|
|
|
Training and evaluation both use the `simple` configuration of the essays_su_g dataset. Evaluation was performed on the `train[80%:100%]` slice (27,619 tokens); the model was presumably trained on the complementary `train[:80%]` slice. The class balance of the evaluation slice is given by the support counts above: Premise (12,073) and O (9,226) dominate, with fewer Claim (4,168) and Majorclaim (2,152) tokens.
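
Percentage slices like `train[80%:100%]` resolve to half-open row-index ranges over the split. A minimal sketch of that arithmetic (the row count is illustrative, not the actual size of essays_su_g, and the boundary rounding here is floor-based for simplicity; the 🤗 Datasets library's exact rounding mode may differ):

```python
# Sketch: map a percentage slice such as train[80%:100%] to row indices.
# n_rows = 1000 below is illustrative, not the real essays_su_g row count.

def pct_slice(n_rows, start_pct, end_pct):
    """Return the half-open [start, end) index range for a percentage slice,
    using floor rounding at the boundaries."""
    start = n_rows * start_pct // 100
    end = n_rows * end_pct // 100
    return start, end

print(pct_slice(1000, 80, 100))  # → (800, 1000)
```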
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 2e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 9 |
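
Assuming no warmup steps (the card lists none), the linear scheduler decays the learning rate from 2e-05 at step 0 to 0 at the final step; the total of 369 optimization steps comes from the training-results table below (41 steps/epoch × 9 epochs). A sketch of the resulting schedule:

```python
# Learning rate under a warmup-free linear schedule, as configured above.
# 369 total steps = 41 steps/epoch x 9 epochs (from the results table).

BASE_LR = 2e-05
TOTAL_STEPS = 369

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linearly decay from base_lr at step 0 to 0 at total_steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # → 2e-05
print(linear_lr(369))  # → 0.0
```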
|
|
|
### Training results |
|
|
|
Per-epoch validation results, summarized as F1-score per class (training loss was not logged at this logging interval; full precision/recall for the final epoch is listed in the evaluation table above):

| Epoch | Step | Validation Loss | Claim F1 | Majorclaim F1 | O F1 | Premise F1 | Accuracy | Macro F1 | Weighted F1 |
|:-----:|:----:|:---------------:|:--------:|:-------------:|:------:|:----------:|:--------:|:--------:|:-----------:|
| 1.0 | 41 | 0.5668 | 0.3196 | 0.5869 | 0.8697 | 0.8554 | 0.7771 | 0.6579 | 0.7584 |
| 2.0 | 82 | 0.4435 | 0.5330 | 0.7320 | 0.9019 | 0.8740 | 0.8254 | 0.7602 | 0.8208 |
| 3.0 | 123 | 0.4314 | 0.5732 | 0.7380 | 0.8989 | 0.8791 | 0.8276 | 0.7723 | 0.8286 |
| 4.0 | 164 | 0.4209 | 0.5734 | 0.7721 | 0.9120 | 0.8797 | 0.8397 | 0.7843 | 0.8359 |
| 5.0 | 205 | 0.4309 | 0.6185 | 0.7896 | 0.9161 | 0.8777 | 0.8420 | 0.8005 | 0.8445 |
| 6.0 | 246 | 0.4454 | 0.6245 | 0.7889 | 0.9172 | 0.8825 | 0.8468 | 0.8033 | 0.8479 |
| 7.0 | 287 | 0.4743 | 0.6038 | 0.7928 | 0.9148 | 0.8803 | 0.8429 | 0.7979 | 0.8432 |
| 8.0 | 328 | 0.4908 | 0.6039 | 0.7891 | 0.9143 | 0.8752 | 0.8383 | 0.7956 | 0.8406 |
| 9.0 | 369 | 0.4966 | 0.6089 | 0.7888 | 0.9142 | 0.8800 | 0.8420 | 0.7980 | 0.8434 |

Validation loss reaches its minimum at epoch 4 while accuracy peaks at epoch 6 (0.8468), suggesting mild overfitting over the final epochs.
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.37.2 |
|
- Pytorch 2.2.0+cu121 |
|
- Datasets 2.17.0 |
|
- Tokenizers 0.15.2 |
|
|