|
--- |
|
license: apache-2.0 |
|
base_model: allenai/longformer-base-4096 |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- essays_su_g |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: longformer-simple |
|
results: |
|
- task: |
|
name: Token Classification |
|
type: token-classification |
|
dataset: |
|
name: essays_su_g |
|
type: essays_su_g |
|
config: simple |
|
split: train[80%:100%] |
|
args: simple |
|
metrics: |
|
- name: Accuracy |
|
type: accuracy |
|
value: 0.8427169702016728 |
|
--- |
|
|
|
|
|
|
# longformer-simple |
|
|
|
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the `simple` configuration of the essays_su_g dataset.
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.5229
- Accuracy: 0.8427

| Label        | Precision | Recall | F1-score | Support |
|:-------------|:---------:|:------:|:--------:|:-------:|
| Claim        | 0.5976    | 0.6161 | 0.6067   | 4168    |
| MajorClaim   | 0.7766    | 0.8095 | 0.7927   | 2152    |
| O            | 0.9332    | 0.8997 | 0.9162   | 9226    |
| Premise      | 0.8752    | 0.8833 | 0.8793   | 12073   |
| Macro avg    | 0.7957    | 0.8022 | 0.7987   | 27619   |
| Weighted avg | 0.8450    | 0.8427 | 0.8437   | 27619   |
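The weighted averages are the per-class scores weighted by token support. A quick sanity check on the precision column, using the unrounded per-class precisions from the final epoch of the training table:

```python
# Reproduce the reported weighted-average precision from the per-class scores.
# Values are (precision, support) pairs from the final evaluation.
per_class = {
    "Claim":      (0.5976262508727019, 4168),
    "MajorClaim": (0.7766384306732055, 2152),
    "O":          (0.9332209106239461, 9226),
    "Premise":    (0.8752462245567958, 12073),
}

total = sum(support for _, support in per_class.values())
weighted_precision = sum(p * n for p, n in per_class.values()) / total
print(f"{weighted_precision:.4f}")  # → 0.8450
```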
|
|
|
## Model description |
|
|
|
longformer-simple is [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) fine-tuned for token-level argument mining on argumentative essays. It assigns each token one of four labels: `MajorClaim`, `Claim`, `Premise`, or `O` (outside any argument component).
|
|
|
## Intended uses & limitations |
|
|
|
The model is intended for segmenting long argumentative texts (up to 4,096 tokens, the Longformer context window) into argument components at the token level. Performance is uneven across classes: `Claim` tokens are recognized markedly less reliably (F1 ≈ 0.61) than `Premise` (≈ 0.88) or `O` (≈ 0.92) tokens. The model was fine-tuned on essays only, so it may not transfer well to other domains.
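A minimal inference sketch (the repo id below is a placeholder, since the full Hub path of this checkpoint is not stated in the card; substitute the actual path):

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Placeholder repo id -- replace with the actual Hub path of this checkpoint.
model_id = "longformer-simple"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

text = "Students should learn a second language because it broadens their perspective."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map each token to its predicted argument-component label.
pred_ids = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, pred_ids):
    print(token, model.config.id2label[pred])
```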
|
|
|
## Training and evaluation data |
|
|
|
The model was fine-tuned on the `simple` configuration of the essays_su_g dataset. The metrics above were computed on a held-out evaluation set consisting of the last 20% of the training data (`train[80%:100%]`).
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 2e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 10 |
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg | |
|
|:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:| |
|
| No log | 1.0 | 41 | 0.5686 | {'precision': 0.5015495867768595, 'recall': 0.2329654510556622, 'f1-score': 0.31815203145478377, 'support': 4168.0} | {'precision': 0.5294117647058824, 'recall': 0.6565985130111525, 'f1-score': 0.5861854387056629, 'support': 2152.0} | {'precision': 0.9169171576289528, 'recall': 0.826577064816822, 'f1-score': 0.8694066009234452, 'support': 9226.0} | {'precision': 0.778798394230115, 'recall': 0.9480659322455065, 'f1-score': 0.8551363466567052, 'support': 12073.0} | 0.7769 | {'precision': 0.6816692258354524, 'recall': 0.6660517402822859, 'f1-score': 0.6572201044351493, 'support': 27619.0} | {'precision': 0.7636649952988127, 'recall': 0.7768565118215721, 'f1-score': 0.7579106826642613, 'support': 27619.0} | |
|
| No log | 2.0 | 82 | 0.4450 | {'precision': 0.5915697674418605, 'recall': 0.4882437619961612, 'f1-score': 0.5349631966351209, 'support': 4168.0} | {'precision': 0.7189862160960427, 'recall': 0.7513940520446096, 'f1-score': 0.7348329925017042, 'support': 2152.0} | {'precision': 0.9161246916348957, 'recall': 0.8855408627791025, 'f1-score': 0.900573192239859, 'support': 9226.0} | {'precision': 0.8421457116507839, 'recall': 0.9076451586184047, 'f1-score': 0.873669523619693, 'support': 12073.0} | 0.8248 | {'precision': 0.7672065967058957, 'recall': 0.7582059588595695, 'f1-score': 0.7610097262490942, 'support': 27619.0} | {'precision': 0.8194472178398864, 'recall': 0.8247945255078026, 'f1-score': 0.8207244155727703, 'support': 27619.0} | |
|
| No log | 3.0 | 123 | 0.4291 | {'precision': 0.5668412662263721, 'recall': 0.597168905950096, 'f1-score': 0.5816100011683608, 'support': 4168.0} | {'precision': 0.7096088435374149, 'recall': 0.7755576208178439, 'f1-score': 0.7411190053285968, 'support': 2152.0} | {'precision': 0.9522893882946761, 'recall': 0.858877086494689, 'f1-score': 0.9031743317946088, 'support': 9226.0} | {'precision': 0.8618876941457586, 'recall': 0.8962975233993208, 'f1-score': 0.878755887607601, 'support': 12073.0} | 0.8292 | {'precision': 0.7726567980510554, 'recall': 0.7819752841654874, 'f1-score': 0.7761648064747918, 'support': 27619.0} | {'precision': 0.8356951611844187, 'recall': 0.829247981462037, 'f1-score': 0.8313459864788912, 'support': 27619.0} | |
|
| No log | 4.0 | 164 | 0.4224 | {'precision': 0.635280095351609, 'recall': 0.5115163147792706, 'f1-score': 0.5667198298777245, 'support': 4168.0} | {'precision': 0.7965432098765433, 'recall': 0.7495353159851301, 'f1-score': 0.7723246349054346, 'support': 2152.0} | {'precision': 0.9076593465452598, 'recall': 0.9183828311294169, 'f1-score': 0.9129896018533484, 'support': 9226.0} | {'precision': 0.8526699217236302, 'recall': 0.9112896546011762, 'f1-score': 0.8810057655349135, 'support': 12073.0} | 0.8407 | {'precision': 0.7980381433742605, 'recall': 0.7726810291237485, 'f1-score': 0.7832599580428552, 'support': 27619.0} | {'precision': 0.8338592100103474, 'recall': 0.8407255874579094, 'f1-score': 0.8357925898565789, 'support': 27619.0} | |
|
| No log | 5.0 | 205 | 0.4366 | {'precision': 0.5841626085115392, 'recall': 0.6619481765834933, 'f1-score': 0.6206276009447756, 'support': 4168.0} | {'precision': 0.7325534489713594, 'recall': 0.8438661710037175, 'f1-score': 0.7842798531634635, 'support': 2152.0} | {'precision': 0.9277765412864456, 'recall': 0.9036418816388467, 'f1-score': 0.9155501866900944, 'support': 9226.0} | {'precision': 0.8999212667308197, 'recall': 0.8520665948811398, 'f1-score': 0.8753403675970047, 'support': 12073.0} | 0.8400 | {'precision': 0.786103466375041, 'recall': 0.8153807060267992, 'f1-score': 0.7989495020988346, 'support': 27619.0} | {'precision': 0.848534001868728, 'recall': 0.8399652413193816, 'f1-score': 0.8432382188039772, 'support': 27619.0} | |
|
| No log | 6.0 | 246 | 0.4462 | {'precision': 0.6034715960324617, 'recall': 0.642274472168906, 'f1-score': 0.6222687122268712, 'support': 4168.0} | {'precision': 0.7745405647691618, 'recall': 0.8029739776951673, 'f1-score': 0.7885010266940452, 'support': 2152.0} | {'precision': 0.9244103126714207, 'recall': 0.913288532408411, 'f1-score': 0.9188157679515838, 'support': 9226.0} | {'precision': 0.8895835093351356, 'recall': 0.8721941522405368, 'f1-score': 0.8808030112923464, 'support': 12073.0} | 0.8458 | {'precision': 0.7980014957020449, 'recall': 0.8076827836282554, 'f1-score': 0.8025971295412117, 'support': 27619.0} | {'precision': 0.8490760766340619, 'recall': 0.8458307686737391, 'f1-score': 0.8472935020261775, 'support': 27619.0} | |
|
| No log | 7.0 | 287 | 0.4800 | {'precision': 0.6023660067600193, 'recall': 0.5986084452975048, 'f1-score': 0.6004813477737665, 'support': 4168.0} | {'precision': 0.7868324125230203, 'recall': 0.7941449814126395, 'f1-score': 0.7904717853839038, 'support': 2152.0} | {'precision': 0.9340485601355166, 'recall': 0.8964881855625406, 'f1-score': 0.9148830263812843, 'support': 9226.0} | {'precision': 0.8673895582329317, 'recall': 0.894475275407935, 'f1-score': 0.8807242180809852, 'support': 12073.0} | 0.8427 | {'precision': 0.797659134412872, 'recall': 0.7959292219201549, 'f1-score': 0.796640094404985, 'support': 27619.0} | {'precision': 0.8433850255361078, 'recall': 0.8426807632426953, 'f1-score': 0.8428109571654543, 'support': 27619.0} | |
|
| No log | 8.0 | 328 | 0.5120 | {'precision': 0.5682210708117443, 'recall': 0.6314779270633397, 'f1-score': 0.5981818181818181, 'support': 4168.0} | {'precision': 0.7462057335581788, 'recall': 0.8224907063197026, 'f1-score': 0.7824933687002653, 'support': 2152.0} | {'precision': 0.9381360777587193, 'recall': 0.8892261001517451, 'f1-score': 0.9130265427633409, 'support': 9226.0} | {'precision': 0.8802864363942713, 'recall': 0.865484966454071, 'f1-score': 0.8728229545169779, 'support': 12073.0} | 0.8348 | {'precision': 0.7832123296307284, 'recall': 0.8021699249972145, 'f1-score': 0.7916311710406005, 'support': 27619.0} | {'precision': 0.8420696535627841, 'recall': 0.8347514392266193, 'f1-score': 0.8377682740520238, 'support': 27619.0} | |
|
| No log | 9.0 | 369 | 0.5199 | {'precision': 0.6004990925589837, 'recall': 0.6350767754318618, 'f1-score': 0.617304104477612, 'support': 4168.0} | {'precision': 0.7618432385874246, 'recall': 0.8220260223048327, 'f1-score': 0.7907912382655341, 'support': 2152.0} | {'precision': 0.9352794749943426, 'recall': 0.8959462388900932, 'f1-score': 0.9151904340124003, 'support': 9226.0} | {'precision': 0.88083976433491, 'recall': 0.879234655843618, 'f1-score': 0.8800364781959874, 'support': 12073.0} | 0.8435 | {'precision': 0.7946153926189152, 'recall': 0.8080709231176013, 'f1-score': 0.8008305637378834, 'support': 27619.0} | {'precision': 0.8474468220550765, 'recall': 0.8435135232991781, 'f1-score': 0.8451766391856577, 'support': 27619.0} | |
|
| No log | 10.0 | 410 | 0.5229 | {'precision': 0.5976262508727019, 'recall': 0.6161228406909789, 'f1-score': 0.6067336089781453, 'support': 4168.0} | {'precision': 0.7766384306732055, 'recall': 0.8094795539033457, 'f1-score': 0.7927189988623435, 'support': 2152.0} | {'precision': 0.9332209106239461, 'recall': 0.8997398655972252, 'f1-score': 0.9161746040505491, 'support': 9226.0} | {'precision': 0.8752462245567958, 'recall': 0.8832932990971589, 'f1-score': 0.8792513501257369, 'support': 12073.0} | 0.8427 | {'precision': 0.7956829541816623, 'recall': 0.8021588898221772, 'f1-score': 0.7987196405041936, 'support': 27619.0} | {'precision': 0.8450333432396857, 'recall': 0.8427169702016728, 'f1-score': 0.8437172024624736, 'support': 27619.0} | |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.37.2 |
|
- Pytorch 2.2.0+cu121 |
|
- Datasets 2.17.0 |
|
- Tokenizers 0.15.2 |
|
|