---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
results:
- task:
name: Token Classification
type: token-classification
dataset:
name: essays_su_g
type: essays_su_g
config: simple
split: train[80%:100%]
args: simple
metrics:
- name: Accuracy
type: accuracy
value: 0.8420290379811
---
# longformer-simple
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4966
- Accuracy: 0.8420

| Label        | Precision | Recall | F1-score | Support |
|:-------------|:---------:|:------:|:--------:|:-------:|
| Claim        | 0.5959    | 0.6226 | 0.6089   | 4168    |
| Majorclaim   | 0.7667    | 0.8123 | 0.7888   | 2152    |
| O            | 0.9350    | 0.8943 | 0.9142   | 9226    |
| Premise      | 0.8769    | 0.8831 | 0.8800   | 12073   |
| Macro avg    | 0.7936    | 0.8031 | 0.7980   | 27619   |
| Weighted avg | 0.8453    | 0.8420 | 0.8434   | 27619   |
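The macro and weighted averages follow directly from the per-class numbers. A quick sketch of the arithmetic, with values copied from the evaluation results above:

```python
# Per-class precision/recall and token support, copied from the
# evaluation results above.
classes = {
    "Claim":      {"precision": 0.5958668197474167, "recall": 0.6226007677543186, "support": 4168},
    "Majorclaim": {"precision": 0.7666666666666667, "recall": 0.8122676579925651, "support": 2152},
    "O":          {"precision": 0.934957507082153,  "recall": 0.8943203988727509, "support": 9226},
    "Premise":    {"precision": 0.8768813224771774, "recall": 0.8831276401888511, "support": 12073},
}

total = sum(c["support"] for c in classes.values())  # 27619 evaluation tokens

# Macro average: unweighted mean over the four classes.
macro_precision = sum(c["precision"] for c in classes.values()) / len(classes)

# Weighted average: mean weighted by token support. For recall this
# equals the overall token accuracy reported above.
weighted_recall = sum(c["recall"] * c["support"] for c in classes.values()) / total

print(macro_precision)  # ≈ 0.7936
print(weighted_recall)  # ≈ 0.8420, matching the reported accuracy
```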
## Model description
longformer-simple is a token-classification model for argument mining: each token of an essay is labelled as belonging to a `Claim`, `Majorclaim`, or `Premise` span, or as `O` (outside any argument component). It is based on [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096), whose sparse attention handles sequences of up to 4096 tokens, so a whole essay can typically be processed in a single forward pass.
## Intended uses & limitations
The model is intended for segmenting essays into argument components (major claims, claims, and premises). It has only been evaluated on the held-out portion of essays_su_g; performance on other domains, text genres, or languages has not been assessed. As the per-class results show, `Claim` tokens are recognized considerably less reliably (F1 ≈ 0.61) than `O` or `Premise` tokens.
## Training and evaluation data
Training and evaluation use the `simple` configuration of the essays_su_g dataset; the reported metrics are computed on the last 20% of the train split (`train[80%:100%]`). The label distribution is imbalanced: of the 27,619 evaluation tokens, 12,073 are `Premise`, 9,226 are `O`, 4,168 are `Claim`, and 2,152 are `Majorclaim`.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 9
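With `lr_scheduler_type: linear`, the Trainer decays the learning rate linearly from 2e-05 toward 0 over all optimizer steps (369 here: 41 steps per epoch × 9 epochs, per the results table). A minimal sketch of that schedule, assuming zero warmup steps since the card does not report a warmup setting:

```python
# Hypothetical re-derivation of the linear LR schedule (lr_scheduler_type:
# linear). Warmup is assumed to be 0, as the card reports no warmup setting.
BASE_LR = 2e-05
TOTAL_STEPS = 369  # 41 optimizer steps per epoch x 9 epochs

def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under a linear decay schedule."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps  # linear warmup ramp
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * (remaining / max(1, TOTAL_STEPS - warmup_steps))

print(linear_lr(0))    # 2e-05 at the first step
print(linear_lr(369))  # 0.0 after the last step
```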
### Training results
Per-class columns report token-level F1-scores (precision, recall, and support for the final epoch are listed in the evaluation results above). The training loss was not logged (`No log` in the Trainer output), apparently because the logging interval exceeded the 369 total steps.

| Epoch | Step | Validation Loss | Claim F1 | Majorclaim F1 | O F1   | Premise F1 | Accuracy | Macro F1 | Weighted F1 |
|:-----:|:----:|:---------------:|:--------:|:-------------:|:------:|:----------:|:--------:|:--------:|:-----------:|
| 1.0   | 41   | 0.5668          | 0.3196   | 0.5869        | 0.8697 | 0.8554     | 0.7771   | 0.6579   | 0.7584      |
| 2.0   | 82   | 0.4435          | 0.5330   | 0.7320        | 0.9019 | 0.8740     | 0.8254   | 0.7602   | 0.8208      |
| 3.0   | 123  | 0.4314          | 0.5732   | 0.7380        | 0.8989 | 0.8791     | 0.8276   | 0.7723   | 0.8286      |
| 4.0   | 164  | 0.4209          | 0.5734   | 0.7721        | 0.9120 | 0.8797     | 0.8397   | 0.7843   | 0.8359      |
| 5.0   | 205  | 0.4309          | 0.6185   | 0.7896        | 0.9161 | 0.8777     | 0.8420   | 0.8005   | 0.8445      |
| 6.0   | 246  | 0.4454          | 0.6245   | 0.7889        | 0.9172 | 0.8825     | 0.8468   | 0.8033   | 0.8479      |
| 7.0   | 287  | 0.4743          | 0.6038   | 0.7928        | 0.9148 | 0.8803     | 0.8429   | 0.7979   | 0.8432      |
| 8.0   | 328  | 0.4908          | 0.6039   | 0.7891        | 0.9143 | 0.8752     | 0.8383   | 0.7956   | 0.8406      |
| 9.0   | 369  | 0.4966          | 0.6089   | 0.7888        | 0.9142 | 0.8800     | 0.8420   | 0.7980   | 0.8434      |
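Note that validation loss bottoms out at epoch 4 while macro F1 peaks at epoch 6, so which checkpoint counts as "best" depends on the selection metric. A small sketch, with values copied from the table above:

```python
# (epoch, validation loss, macro-average F1), copied from the table above.
history = [
    (1, 0.5668, 0.6579), (2, 0.4435, 0.7602), (3, 0.4314, 0.7723),
    (4, 0.4209, 0.7843), (5, 0.4309, 0.8005), (6, 0.4454, 0.8033),
    (7, 0.4743, 0.7979), (8, 0.4908, 0.7956), (9, 0.4966, 0.7980),
]

# Select the best epoch under two different criteria.
best_by_loss = min(history, key=lambda row: row[1])[0]
best_by_macro_f1 = max(history, key=lambda row: row[2])[0]

print(best_by_loss)      # 4
print(best_by_macro_f1)  # 6
```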
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2