|
--- |
|
license: apache-2.0 |
|
base_model: allenai/longformer-base-4096 |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- essays_su_g |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: longformer-simple |
|
results: |
|
- task: |
|
name: Token Classification |
|
type: token-classification |
|
dataset: |
|
name: essays_su_g |
|
type: essays_su_g |
|
config: simple |
|
split: train[80%:100%] |
|
args: simple |
|
metrics: |
|
- name: Accuracy |
|
type: accuracy |
|
value: 0.8421376588580325 |
|
--- |
|
|
|
|
|
|
# longformer-simple |
|
|
|
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset. |
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.4683 |
|
| | Precision | Recall | F1-score | Support |
|:-------------|--------:|-------:|--------:|--------:|
| Claim | 0.5930 | 0.6370 | 0.6142 | 4168 |
| Majorclaim | 0.7712 | 0.8113 | 0.7908 | 2152 |
| O | 0.9351 | 0.8943 | 0.9142 | 9226 |
| Premise | 0.8800 | 0.8786 | 0.8793 | 12073 |
| Accuracy | | | 0.8421 | 27619 |
| Macro avg | 0.7948 | 0.8053 | 0.7996 | 27619 |
| Weighted avg | 0.8466 | 0.8421 | 0.8441 | 27619 |
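As a sanity check, the macro and weighted averages follow directly from the per-class F1-scores and token supports reported above; a minimal sketch in plain Python:

```python
# Per-class F1-score and token support from the evaluation report above.
per_class = {
    "Claim":      (0.6142278773857722, 4168),
    "Majorclaim": (0.7907608695652174, 2152),
    "O":          (0.9142382271468144, 9226),
    "Premise":    (0.8792638952211216, 12073),
}

total = sum(support for _, support in per_class.values())  # 27619 tokens

# Macro average: unweighted mean over the four classes.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class weighted by its token support.
weighted_f1 = sum(f1 * support for f1, support in per_class.values()) / total

print(f"macro F1:    {macro_f1:.4f}")     # ~0.7996, matching the report
print(f"weighted F1: {weighted_f1:.4f}")  # ~0.8441, matching the report
```

The gap between the two averages (0.7996 vs. 0.8441) reflects the class imbalance: Premise and O together account for over three quarters of the tokens and are also the best-classified labels.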
|
|
|
## Model description |
|
|
|
longformer-simple is [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) fine-tuned for token classification on the essays_su_g dataset (config `simple`). Each token is labeled with one of four classes: Claim, Majorclaim, Premise, or O (non-argumentative).
|
|
|
## Intended uses & limitations |
|
|
|
The model is intended for token-level identification of argumentative components (claims, major claims, and premises) in essay-style text. It has only been evaluated on essays_su_g, so performance on other domains and genres is unknown.
|
|
|
## Training and evaluation data |
|
|
|
The model was trained and evaluated on the `simple` configuration of the essays_su_g dataset. The metrics above were computed on the train[80%:100%] split (27,619 labeled tokens).
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 2e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 8 |
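With `lr_scheduler_type: linear`, the learning rate decays linearly from its peak to zero over training (the results table below shows 41 optimizer steps per epoch, so 328 steps total). Assuming no warmup, which the card does not specify, the schedule can be sketched as:

```python
# Linear schedule with no warmup (a simplifying assumption; the card does
# not list warmup steps): lr falls from the peak to 0 over all steps.
PEAK_LR = 2e-5
TOTAL_STEPS = 41 * 8  # 41 optimizer steps/epoch * 8 epochs = 328

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / TOTAL_STEPS

print(linear_lr(0))    # 2e-05 at the start of training
print(linear_lr(164))  # 1e-05 halfway through (end of epoch 4)
print(linear_lr(328))  # 0.0 at the end of training
```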
|
|
|
### Training results |
|
|
|
Per-class and averaged columns report F1-score on the evaluation set.

| Training Loss | Epoch | Step | Validation Loss | Claim F1 | Majorclaim F1 | O F1 | Premise F1 | Accuracy | Macro avg F1 | Weighted avg F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----:|:----------:|:--------:|:------------:|:---------------:|
| No log | 1.0 | 41 | 0.5650 | 0.3164 | 0.5903 | 0.8725 | 0.8558 | 0.7792 | 0.6588 | 0.7593 |
| No log | 2.0 | 82 | 0.4458 | 0.5326 | 0.7303 | 0.9034 | 0.8737 | 0.8256 | 0.7600 | 0.8210 |
| No log | 3.0 | 123 | 0.4332 | 0.5728 | 0.7399 | 0.8987 | 0.8781 | 0.8273 | 0.7724 | 0.8281 |
| No log | 4.0 | 164 | 0.4213 | 0.5727 | 0.7695 | 0.9132 | 0.8781 | 0.8385 | 0.7834 | 0.8353 |
| No log | 5.0 | 205 | 0.4305 | 0.6138 | 0.7871 | 0.9147 | 0.8791 | 0.8417 | 0.7987 | 0.8438 |
| No log | 6.0 | 246 | 0.4474 | 0.6125 | 0.7895 | 0.9153 | 0.8854 | 0.8472 | 0.8006 | 0.8467 |
| No log | 7.0 | 287 | 0.4659 | 0.6070 | 0.7953 | 0.9140 | 0.8834 | 0.8451 | 0.7999 | 0.8451 |
| No log | 8.0 | 328 | 0.4683 | 0.6142 | 0.7908 | 0.9142 | 0.8793 | 0.8421 | 0.7996 | 0.8441 |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.37.2 |
|
- Pytorch 2.2.0+cu121 |
|
- Datasets 2.17.0 |
|
- Tokenizers 0.15.2 |
|
|