---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: train[80%:100%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.939172308917774
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
After the final epoch (epoch 10), it achieves the following results on the evaluation set (the `train[80%:100%]` split of the `spans` configuration):
- Loss: 0.2166
- B: {'precision': 0.8636788048552755, 'recall': 0.8868648130393096, 'f1-score': 0.8751182592242194, 'support': 1043.0}
- I: {'precision': 0.948943661971831, 'recall': 0.9630547550432277, 'f1-score': 0.9559471365638768, 'support': 17350.0}
- O: {'precision': 0.9289709172259508, 'recall': 0.9001734229351832, 'f1-score': 0.9143454805680943, 'support': 9226.0}
- Accuracy: 0.9392
- Macro avg: {'precision': 0.9138644613510191, 'recall': 0.9166976636725735, 'f1-score': 0.9151369587853968, 'support': 27619.0}
- Weighted avg: {'precision': 0.9390519284189125, 'recall': 0.939172308917774, 'f1-score': 0.9389978843359775, 'support': 27619.0}
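
The aggregate rows can be re-derived from the per-class rows. The following is a minimal sketch in plain Python, not the evaluation code used to produce this card: the variable and function names are illustrative, and the numbers are copied from the list above. It also shows that, when every token carries exactly one label, support-weighted recall equals accuracy.

```python
# Per-class evaluation metrics copied from the list above.
per_class = {
    "B": {"precision": 0.8636788048552755, "recall": 0.8868648130393096,
          "f1": 0.8751182592242194, "support": 1043},
    "I": {"precision": 0.948943661971831, "recall": 0.9630547550432277,
          "f1": 0.9559471365638768, "support": 17350},
    "O": {"precision": 0.9289709172259508, "recall": 0.9001734229351832,
          "f1": 0.9143454805680943, "support": 9226},
}


def macro_avg(metric):
    """Unweighted mean over classes: every class counts equally."""
    return sum(c[metric] for c in per_class.values()) / len(per_class)


def weighted_avg(metric):
    """Mean over classes, weighted by each class's support."""
    total = sum(c["support"] for c in per_class.values())
    return sum(c[metric] * c["support"] for c in per_class.values()) / total


total_support = sum(c["support"] for c in per_class.values())
macro_f1 = macro_avg("f1")
weighted_recall = weighted_avg("recall")

print(f"total support:   {total_support}")
print(f"macro F1:        {macro_f1:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

Because `B` has far fewer examples than `I` and `O`, the macro average is pulled down by the weaker `B` scores more than the weighted average is.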
## Model description
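longformer-spans is a token-classification model fine-tuned from `allenai/longformer-base-4096`. It assigns each token one of three labels, `B`, `I`, or `O`; in the usual BIO tagging scheme these mark the beginning of a span, the inside of a span, and tokens outside any span. More information needed.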
## Intended uses & limitations
More information needed
## Training and evaluation data
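Evaluation used the `train[80%:100%]` split of the `spans` configuration of essays_su_g. The reported support totals 27,619 labels: 1,043 `B`, 17,350 `I`, and 9,226 `O`. More information needed on the training split and on the dataset itself.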
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
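
To make the `linear` scheduler concrete, here is a small sketch of how the learning rate would decay over the run. It is an illustration, not the Trainer's own code. The total of 410 steps (41 per epoch) is taken from the Step column of the training results; `WARMUP_STEPS = 0` is an assumption, since this card lists no warmup setting, and `linear_lr` is a hypothetical helper name.

```python
BASE_LR = 2e-05        # learning_rate from the list above
STEPS_PER_EPOCH = 41   # from the Step column of the training results
NUM_EPOCHS = 10        # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 410
WARMUP_STEPS = 0       # assumption: no warmup is listed in this card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and at the end of selected epochs.
for epoch in (0, 1, 5, 10):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2} (step {step:>3}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate falls to half its initial value by the end of epoch 5 and reaches zero at step 410.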
### Training results
| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.2992 | {'precision': 0.7993138936535163, 'recall': 0.4467881112176414, 'f1-score': 0.5731857318573186, 'support': 1043.0} | {'precision': 0.8829387840233601, 'recall': 0.9759654178674352, 'f1-score': 0.9271243977222952, 'support': 17350.0} | {'precision': 0.9361160600661746, 'recall': 0.7973119445046607, 'f1-score': 0.8611566377897449, 'support': 9226.0} | 0.8963 | {'precision': 0.8727895792476836, 'recall': 0.7400218245299124, 'f1-score': 0.7871555891231196, 'support': 27619.0} | {'precision': 0.8975444101544748, 'recall': 0.8963032694883957, 'f1-score': 0.8917220811418657, 'support': 27619.0} |
| No log | 2.0 | 82 | 0.2008 | {'precision': 0.7899231426131511, 'recall': 0.8868648130393096, 'f1-score': 0.8355916892502258, 'support': 1043.0} | {'precision': 0.9329899761865205, 'recall': 0.9710086455331413, 'f1-score': 0.951619736210354, 'support': 17350.0} | {'precision': 0.9478012155881301, 'recall': 0.862020377194884, 'f1-score': 0.9028779020264517, 'support': 9226.0} | 0.9314 | {'precision': 0.8902381114626006, 'recall': 0.9066312785891116, 'f1-score': 0.8966964424956773, 'support': 27619.0} | {'precision': 0.9325348470110335, 'recall': 0.9314240196965857, 'f1-score': 0.9309560838275704, 'support': 27619.0} |
| No log | 3.0 | 123 | 0.1754 | {'precision': 0.8574144486692015, 'recall': 0.8648130393096836, 'f1-score': 0.8610978520286395, 'support': 1043.0} | {'precision': 0.9637472869126532, 'recall': 0.9469164265129683, 'f1-score': 0.9552577259644737, 'support': 17350.0} | {'precision': 0.9027310924369748, 'recall': 0.9314979406026447, 'f1-score': 0.916888936306412, 'support': 9226.0} | 0.9387 | {'precision': 0.9079642760062764, 'recall': 0.9144091354750987, 'f1-score': 0.9110815047665084, 'support': 27619.0} | {'precision': 0.9393495693805003, 'recall': 0.9386654114920888, 'f1-score': 0.9388849680116025, 'support': 27619.0} |
| No log | 4.0 | 164 | 0.1737 | {'precision': 0.8583732057416268, 'recall': 0.8600191754554171, 'f1-score': 0.8591954022988505, 'support': 1043.0} | {'precision': 0.9443852068017284, 'recall': 0.9699135446685879, 'f1-score': 0.9569791577810003, 'support': 17350.0} | {'precision': 0.9394631639063392, 'recall': 0.8915022761760243, 'f1-score': 0.9148545687114177, 'support': 9226.0} | 0.9396 | {'precision': 0.9140738588165648, 'recall': 0.9071449987666765, 'f1-score': 0.9103430429304228, 'support': 27619.0} | {'precision': 0.9394928759838658, 'recall': 0.9395705854665267, 'f1-score': 0.939214940549245, 'support': 27619.0} |
| No log | 5.0 | 205 | 0.2081 | {'precision': 0.8590225563909775, 'recall': 0.8763183125599233, 'f1-score': 0.8675842429995253, 'support': 1043.0} | {'precision': 0.9336273428886439, 'recall': 0.9761383285302594, 'f1-score': 0.9544096928712315, 'support': 17350.0} | {'precision': 0.9511586452762923, 'recall': 0.8675482332538478, 'f1-score': 0.9074315514993481, 'support': 9226.0} | 0.9361 | {'precision': 0.9146028481853046, 'recall': 0.9066682914480101, 'f1-score': 0.9098084957900351, 'support': 27619.0} | {'precision': 0.9366662292897221, 'recall': 0.9360947174046852, 'f1-score': 0.9354379967014502, 'support': 27619.0} |
| No log | 6.0 | 246 | 0.1913 | {'precision': 0.8325991189427313, 'recall': 0.9060402684563759, 'f1-score': 0.8677685950413224, 'support': 1043.0} | {'precision': 0.935986255057363, 'recall': 0.973371757925072, 'f1-score': 0.9543129997457125, 'support': 17350.0} | {'precision': 0.9491766378391185, 'recall': 0.8684153479297637, 'f1-score': 0.9070017546838739, 'support': 9226.0} | 0.9358 | {'precision': 0.9059206706130709, 'recall': 0.9159424581037373, 'f1-score': 0.9096944498236362, 'support': 27619.0} | {'precision': 0.9364881446470265, 'recall': 0.9357688547738875, 'f1-score': 0.9352406451692542, 'support': 27619.0} |
| No log | 7.0 | 287 | 0.1970 | {'precision': 0.8484848484848485, 'recall': 0.8859060402684564, 'f1-score': 0.8667917448405252, 'support': 1043.0} | {'precision': 0.9405801971326165, 'recall': 0.9680115273775216, 'f1-score': 0.9540987331704823, 'support': 17350.0} | {'precision': 0.9371685496887249, 'recall': 0.8810969000650336, 'f1-score': 0.9082681564245809, 'support': 9226.0} | 0.9359 | {'precision': 0.9087445317687299, 'recall': 0.9116714892370039, 'f1-score': 0.9097195448118628, 'support': 27619.0} | {'precision': 0.9359626762970696, 'recall': 0.9358774756508201, 'f1-score': 0.9354921909391984, 'support': 27619.0} |
| No log | 8.0 | 328 | 0.2042 | {'precision': 0.8507734303912647, 'recall': 0.8964525407478428, 'f1-score': 0.8730158730158729, 'support': 1043.0} | {'precision': 0.9413907099232364, 'recall': 0.96835734870317, 'f1-score': 0.9546836378100406, 'support': 17350.0} | {'precision': 0.9384296091317883, 'recall': 0.8821807934099285, 'f1-score': 0.9094362813565005, 'support': 9226.0} | 0.9369 | {'precision': 0.9101979164820965, 'recall': 0.9156635609536471, 'f1-score': 0.9123785973941381, 'support': 27619.0} | {'precision': 0.9369795097185314, 'recall': 0.936855063543213, 'f1-score': 0.9364848764747034, 'support': 27619.0} |
| No log | 9.0 | 369 | 0.2107 | {'precision': 0.8516483516483516, 'recall': 0.8916586768935763, 'f1-score': 0.8711943793911008, 'support': 1043.0} | {'precision': 0.9517556380245504, 'recall': 0.960806916426513, 'f1-score': 0.9562598594579091, 'support': 17350.0} | {'precision': 0.9260985352862849, 'recall': 0.9046173856492521, 'f1-score': 0.9152319333260226, 'support': 9226.0} | 0.9394 | {'precision': 0.9098341749863957, 'recall': 0.9190276596564471, 'f1-score': 0.9142287240583441, 'support': 27619.0} | {'precision': 0.9394045634181702, 'recall': 0.9394257576306166, 'f1-score': 0.9393422685892149, 'support': 27619.0} |
| No log | 10.0 | 410 | 0.2166 | {'precision': 0.8636788048552755, 'recall': 0.8868648130393096, 'f1-score': 0.8751182592242194, 'support': 1043.0} | {'precision': 0.948943661971831, 'recall': 0.9630547550432277, 'f1-score': 0.9559471365638768, 'support': 17350.0} | {'precision': 0.9289709172259508, 'recall': 0.9001734229351832, 'f1-score': 0.9143454805680943, 'support': 9226.0} | 0.9392 | {'precision': 0.9138644613510191, 'recall': 0.9166976636725735, 'f1-score': 0.9151369587853968, 'support': 27619.0} | {'precision': 0.9390519284189125, 'recall': 0.939172308917774, 'f1-score': 0.9389978843359775, 'support': 27619.0} |
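
The table shows that the "best" epoch depends on the selection criterion. The sketch below copies validation loss, accuracy, and macro-average F1 from the table (rounded to four decimals) and picks the best epoch under each criterion. The names `history`, `best_by_loss`, `best_by_accuracy`, and `best_by_macro_f1` are illustrative and not part of the training setup.

```python
# (epoch, validation loss, accuracy, macro-avg F1), rounded from the table above.
history = [
    (1, 0.2992, 0.8963, 0.7872),
    (2, 0.2008, 0.9314, 0.8967),
    (3, 0.1754, 0.9387, 0.9111),
    (4, 0.1737, 0.9396, 0.9103),
    (5, 0.2081, 0.9361, 0.9098),
    (6, 0.1913, 0.9358, 0.9097),
    (7, 0.1970, 0.9359, 0.9097),
    (8, 0.2042, 0.9369, 0.9124),
    (9, 0.2107, 0.9394, 0.9142),
    (10, 0.2166, 0.9392, 0.9151),
]

# Lowest validation loss, highest accuracy, highest macro-average F1.
best_by_loss = min(history, key=lambda row: row[1])[0]
best_by_accuracy = max(history, key=lambda row: row[2])[0]
best_by_macro_f1 = max(history, key=lambda row: row[3])[0]

print(f"best epoch by validation loss: {best_by_loss}")
print(f"best epoch by accuracy:        {best_by_accuracy}")
print(f"best epoch by macro F1:        {best_by_macro_f1}")
```

Validation loss and accuracy both favor epoch 4, while macro-average F1 is highest at epoch 10. Validation loss rises after epoch 4, so the checkpoint to use depends on which metric matters for the downstream task.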
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2