---
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: simple
      split: train[80%:100%]
      args: simple
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8379014446576633
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# longformer-simple
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4267
- Accuracy: 0.8379

| Label | Precision | Recall | F1-score | Support |
|:------------:|:---------:|:------:|:--------:|:-------:|
| Claim | 0.6011 | 0.5763 | 0.5884 | 4168 |
| Majorclaim | 0.7354 | 0.8109 | 0.7713 | 2152 |
| O | 0.9332 | 0.8959 | 0.9142 | 9226 |
| Premise | 0.8658 | 0.8887 | 0.8771 | 12073 |
| Macro avg | 0.7839 | 0.7929 | 0.7877 | 27619 |
| Weighted avg | 0.8382 | 0.8379 | 0.8377 | 27619 |

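
The macro and weighted averages reported above can be recomputed from the per-class values. The sketch below does this for precision only, assuming the usual definitions (macro = unweighted mean over classes, weighted = mean weighted by class support). The helper names `macro_average` and `weighted_average` are illustrative and not part of the training code.

```python
# Per-class precision and support, copied from the evaluation results above.
precision = {
    "Claim": 0.6011011011011012,
    "Majorclaim": 0.7353560893383903,
    "O": 0.9331677579589072,
    "Premise": 0.8658005164622337,
}
support = {"Claim": 4168, "Majorclaim": 2152, "O": 9226, "Premise": 12073}


def macro_average(values):
    """Unweighted mean over classes (hypothetical helper)."""
    return sum(values.values()) / len(values)


def weighted_average(values, weights):
    """Mean over classes, weighted by each class's support (hypothetical helper)."""
    total = sum(weights.values())
    return sum(values[k] * weights[k] for k in values) / total


macro_p = macro_average(precision)
weighted_p = weighted_average(precision, support)
print(f"macro precision:    {macro_p:.4f}")
print(f"weighted precision: {weighted_p:.4f}")
```

Because Claim is both the weakest class and a fairly small one, the macro average sits noticeably below the weighted average; the same calculation applies to recall and F1-score.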
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
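According to the metadata at the top of this card, evaluation used the `train[80%:100%]` split of the `simple` configuration of essays_su_g. The evaluation results cover four labels (Claim, Majorclaim, O, Premise) with a total support of 27,619. More information is needed on the training split and on the dataset itself.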
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
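
To illustrate what `lr_scheduler_type: linear` implies for the learning rate, here is a minimal pure-Python sketch of a linear warmup/decay schedule. It assumes zero warmup steps (no warmup value is listed in this card) and 205 total optimizer steps (the final step count in the training results table). The function name `linear_lr` is illustrative, not a library call.

```python
def linear_lr(step, base_lr=2e-05, total_steps=205, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 and total_steps=205 are assumptions, see the text above.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (41 steps per epoch in the table).
for epoch, step in enumerate([41, 82, 123, 164, 205], start=1):
    print(f"end of epoch {epoch} (step {step}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 2e-05 and reaches zero at the last step, so the later epochs train with progressively smaller updates.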
### Training results
| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.6166 | {'precision': 0.4200196270853778, 'recall': 0.2053742802303263, 'f1-score': 0.27586206896551724, 'support': 4168.0} | {'precision': 0.6073394495412844, 'recall': 0.46143122676579923, 'f1-score': 0.524425666754687, 'support': 2152.0} | {'precision': 0.897315672254132, 'recall': 0.8297203555170172, 'f1-score': 0.8621951906290477, 'support': 9226.0} | {'precision': 0.7481024975673046, 'recall': 0.9551892653027416, 'f1-score': 0.8390570430733411, 'support': 12073.0} | 0.7616 | {'precision': 0.6681943116120247, 'recall': 0.612928781953971, 'f1-score': 0.6253849923556483, 'support': 27619.0} | {'precision': 0.7374674009360002, 'recall': 0.7616495890510157, 'f1-score': 0.7372788894627758, 'support': 27619.0} |
| No log | 2.0 | 82 | 0.4575 | {'precision': 0.5743048897411314, 'recall': 0.43114203454894434, 'f1-score': 0.49253117719610806, 'support': 4168.0} | {'precision': 0.7058560572194904, 'recall': 0.7337360594795539, 'f1-score': 0.7195260879471406, 'support': 2152.0} | {'precision': 0.9206993795826283, 'recall': 0.8846737481031867, 'f1-score': 0.9023271239843015, 'support': 9226.0} | {'precision': 0.8243949805796236, 'recall': 0.9141886854965626, 'f1-score': 0.8669730175562625, 'support': 12073.0} | 0.8174 | {'precision': 0.7563138267807185, 'recall': 0.7409351319070618, 'f1-score': 0.7453393516709531, 'support': 27619.0} | {'precision': 0.8095875336596003, 'recall': 0.8173720989174119, 'f1-score': 0.8107869718183696, 'support': 27619.0} |
| No log | 3.0 | 123 | 0.4417 | {'precision': 0.6082102988836874, 'recall': 0.4052303262955854, 'f1-score': 0.4863930885529157, 'support': 4168.0} | {'precision': 0.7309513560051657, 'recall': 0.7890334572490706, 'f1-score': 0.7588826815642457, 'support': 2152.0} | {'precision': 0.9306548632391329, 'recall': 0.8887925428137872, 'f1-score': 0.9092421134334979, 'support': 9226.0} | {'precision': 0.8175517945725124, 'recall': 0.9282696927027251, 'f1-score': 0.8693999456964432, 'support': 12073.0} | 0.8253 | {'precision': 0.7718420781751247, 'recall': 0.7528315047652921, 'f1-score': 0.7559794573117755, 'support': 27619.0} | {'precision': 0.8169938241061772, 'recall': 0.8253014229334878, 'f1-score': 0.8162980269649668, 'support': 27619.0} |
| No log | 4.0 | 164 | 0.4247 | {'precision': 0.5918674698795181, 'recall': 0.5657389635316699, 'f1-score': 0.5785083415112856, 'support': 4168.0} | {'precision': 0.7616387337057728, 'recall': 0.7602230483271375, 'f1-score': 0.7609302325581395, 'support': 2152.0} | {'precision': 0.918848167539267, 'recall': 0.9130717537394321, 'f1-score': 0.9159508535391975, 'support': 9226.0} | {'precision': 0.8669534864842926, 'recall': 0.8846185703636213, 'f1-score': 0.8756969498196131, 'support': 12073.0} | 0.8363 | {'precision': 0.7848269644022126, 'recall': 0.7809130839904652, 'f1-score': 0.782771594357059, 'support': 27619.0} | {'precision': 0.8345694197992249, 'recall': 0.8363083384626525, 'f1-score': 0.8353523472178203, 'support': 27619.0} |
| No log | 5.0 | 205 | 0.4267 | {'precision': 0.6011011011011012, 'recall': 0.5762955854126679, 'f1-score': 0.58843704066634, 'support': 4168.0} | {'precision': 0.7353560893383903, 'recall': 0.8108736059479554, 'f1-score': 0.7712707182320443, 'support': 2152.0} | {'precision': 0.9331677579589072, 'recall': 0.8959462388900932, 'f1-score': 0.9141782791417828, 'support': 9226.0} | {'precision': 0.8658005164622337, 'recall': 0.8886772136171622, 'f1-score': 0.8770897200081749, 'support': 12073.0} | 0.8379 | {'precision': 0.7838563662151581, 'recall': 0.7929481609669696, 'f1-score': 0.7877439395120855, 'support': 27619.0} | {'precision': 0.838194397473588, 'recall': 0.8379014446576633, 'f1-score': 0.8376730933108891, 'support': 27619.0} |
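
The table above can be read programmatically. The sketch below copies the validation loss and accuracy columns by hand, picks the best epoch under each criterion, and parses one metric cell with `ast.literal_eval` (the cells are Python dict literals). The variable names are illustrative.

```python
import ast

# (epoch, validation loss, accuracy), copied from the table above.
history = [
    (1, 0.6166, 0.7616),
    (2, 0.4575, 0.8174),
    (3, 0.4417, 0.8253),
    (4, 0.4247, 0.8363),
    (5, 0.4267, 0.8379),
]

best_by_loss = min(history, key=lambda row: row[1])
best_by_accuracy = max(history, key=lambda row: row[2])
print("lowest validation loss at epoch", best_by_loss[0])
print("highest accuracy at epoch", best_by_accuracy[0])

# A metric cell is a dict literal, so it can be parsed without eval().
claim_epoch5 = ast.literal_eval(
    "{'precision': 0.6011011011011012, 'recall': 0.5762955854126679, "
    "'f1-score': 0.58843704066634, 'support': 4168.0}"
)

# F1 is the harmonic mean of precision and recall.
p, r = claim_epoch5["precision"], claim_epoch5["recall"]
f1_from_pr = 2 * p * r / (p + r)
print(f"Claim F1 recomputed: {f1_from_pr:.4f}")
```

The two criteria disagree slightly here: validation loss is lowest at epoch 4, while accuracy is highest at epoch 5, which is the checkpoint reported at the top of this card.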
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
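
To compare a local environment against the versions listed above, a small standard-library check along these lines may help. It assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`; the `installed_versions` helper is illustrative.

```python
from importlib import metadata

# Versions listed above; keys are assumed PyPI distribution names
# (PyTorch is distributed as "torch").
EXPECTED = {
    "transformers": "4.37.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}


def installed_versions(expected):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found)
    return report


for name, (wanted, found) in installed_versions(EXPECTED).items():
    status = "ok" if found == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {found} ({status})")
```

A differing version does not necessarily prevent loading the model; the check only makes mismatches visible.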