---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: train[80%:100%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9436981787899634
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set (these values match the final checkpoint, epoch 16, in the training results table):
- Loss: 0.2879
- B: {'precision': 0.8652946679139383, 'recall': 0.8868648130393096, 'f1-score': 0.8759469696969697, 'support': 1043.0}
- I: {'precision': 0.9512374695588152, 'recall': 0.9680691642651297, 'f1-score': 0.9595795126688947, 'support': 17350.0}
- O: {'precision': 0.9381536039581694, 'recall': 0.9042922176457836, 'f1-score': 0.9209117500965837, 'support': 9226.0}
- Accuracy: 0.9437
- Macro avg: {'precision': 0.9182285804769742, 'recall': 0.9197420649834077, 'f1-score': 0.9188127441541494, 'support': 27619.0}
- Weighted avg: {'precision': 0.9436213326187679, 'recall': 0.9436981787899634, 'f1-score': 0.9435044368221277, 'support': 27619.0}
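
The macro and weighted averages above can be reproduced from the three per-class rows. The following is a minimal sketch of that arithmetic, with the numbers copied from this card; the helper names `macro_avg` and `weighted_avg` are illustrative and not part of any library.

```python
# Per-class metrics copied from the evaluation results above.
per_class = {
    "B": {"precision": 0.8652946679139383, "recall": 0.8868648130393096,
          "f1-score": 0.8759469696969697, "support": 1043.0},
    "I": {"precision": 0.9512374695588152, "recall": 0.9680691642651297,
          "f1-score": 0.9595795126688947, "support": 17350.0},
    "O": {"precision": 0.9381536039581694, "recall": 0.9042922176457836,
          "f1-score": 0.9209117500965837, "support": 9226.0},
}


def macro_avg(metrics, key):
    """Unweighted mean of one metric over all classes."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


def weighted_avg(metrics, key):
    """Mean of one metric over all classes, weighted by class support."""
    total = sum(m["support"] for m in metrics.values())
    return sum(m[key] * m["support"] for m in metrics.values()) / total


for key in ("precision", "recall", "f1-score"):
    print(f"{key}: macro={macro_avg(per_class, key):.4f} "
          f"weighted={weighted_avg(per_class, key):.4f}")
```

Because every token carries exactly one label, the support-weighted recall equals the overall accuracy (0.9437). The macro averages are lower because the small B class (1043 tokens) counts as much as the much larger I and O classes.
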
## Model description
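A token-classification model that assigns each token one of three labels: B, I, or O. Given the `spans` dataset configuration, these presumably follow the standard begin/inside/outside span-tagging scheme, but the card does not confirm this. More information needed.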
## Intended uses & limitations
More information needed
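
One likely downstream step is turning per-token B/I/O predictions into spans. The sketch below assumes the conventional reading of the labels (B opens a span, I continues it, O lies outside any span), which this card does not state explicitly. The function name `bio_to_spans` is hypothetical, and the input is a plain list of label strings rather than actual model output.

```python
from typing import List, Tuple


def bio_to_spans(labels: List[str]) -> List[Tuple[int, int]]:
    """Convert a per-token B/I/O label sequence into (start, end) spans.

    `end` is exclusive. An "I" with no open span is treated leniently
    as opening a new span (an assumption, not behaviour from this card).
    """
    spans = []
    start = None  # index where the currently open span began, if any
    for i, label in enumerate(labels):
        if label == "B":
            if start is not None:  # a new B closes the previous span
                spans.append((start, i))
            start = i
        elif label == "I":
            if start is None:  # stray I: open a span here
                start = i
        else:  # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:  # span running to the end of the sequence
        spans.append((start, len(labels)))
    return spans


labels = ["O", "B", "I", "I", "O", "B", "I"]
print(bio_to_spans(labels))  # [(1, 4), (5, 7)]
```

The indices refer to token positions. Mapping them back to character offsets in the original essay would additionally need the tokenizer's offset mapping.
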
## Training and evaluation data
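The model card metadata lists the `spans` configuration of essays_su_g, with evaluation on the `train[80%:100%]` split. More information needed on the training split and preprocessing.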
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
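
To make the `linear` scheduler concrete, here is a minimal pure-Python sketch of a linearly decaying learning rate. It assumes 656 total optimizer steps (16 epochs of 41 steps, as the step counts in the training results indicate) and zero warmup steps. The card lists no warmup setting, so the zero is an assumption. The function name `linear_lr` is illustrative.

```python
def linear_lr(step: int, base_lr: float = 2e-5,
              total_steps: int = 656, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr (unused when warmup_steps == 0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


steps_per_epoch = 41  # assumption: taken from the Step column (41 steps per epoch)
for epoch in (0, 4, 8, 12, 16):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch:2d} (step {step:3d}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 2e-05, reaches half that value midway through training (step 328), and hits zero at step 656.
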
### Training results
| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.2947 | {'precision': 0.8076923076923077, 'recall': 0.4429530201342282, 'f1-score': 0.5721362229102167, 'support': 1043.0} | {'precision': 0.8850358282336942, 'recall': 0.9752737752161383, 'f1-score': 0.927966217883682, 'support': 17350.0} | {'precision': 0.9349142280524723, 'recall': 0.8033817472360719, 'f1-score': 0.864171621779177, 'support': 9226.0} | 0.8978 | {'precision': 0.8758807879928246, 'recall': 0.7405361808621462, 'f1-score': 0.7880913541910252, 'support': 27619.0} | {'precision': 0.8987766886849552, 'recall': 0.8977515478474963, 'f1-score': 0.8932184128068332, 'support': 27619.0} |
| No log | 2.0 | 82 | 0.1954 | {'precision': 0.7979274611398963, 'recall': 0.8859060402684564, 'f1-score': 0.8396183552930486, 'support': 1043.0} | {'precision': 0.9369951534733441, 'recall': 0.9694524495677234, 'f1-score': 0.9529475085691623, 'support': 17350.0} | {'precision': 0.9441833137485312, 'recall': 0.8709083026230219, 'f1-score': 0.9060667568786648, 'support': 9226.0} | 0.9334 | {'precision': 0.8930353094539237, 'recall': 0.9087555974864006, 'f1-score': 0.8995442069136251, 'support': 27619.0} | {'precision': 0.9341445927577168, 'recall': 0.9333791954813715, 'f1-score': 0.933007462877301, 'support': 27619.0} |
| No log | 3.0 | 123 | 0.1738 | {'precision': 0.856203007518797, 'recall': 0.8734419942473634, 'f1-score': 0.8647365923113433, 'support': 1043.0} | {'precision': 0.9658622719246616, 'recall': 0.945821325648415, 'f1-score': 0.9557367501456028, 'support': 17350.0} | {'precision': 0.9021432305279665, 'recall': 0.9352915673097767, 'f1-score': 0.9184183917833005, 'support': 9226.0} | 0.9396 | {'precision': 0.9080695033238083, 'recall': 0.9181849624018517, 'f1-score': 0.9129639114134155, 'support': 27619.0} | {'precision': 0.940436062116152, 'recall': 0.9395705854665267, 'f1-score': 0.9398342070096555, 'support': 27619.0} |
| No log | 4.0 | 164 | 0.1750 | {'precision': 0.8874239350912779, 'recall': 0.8389261744966443, 'f1-score': 0.862493839329719, 'support': 1043.0} | {'precision': 0.9537671232876712, 'recall': 0.9631123919308358, 'f1-score': 0.9584169773444221, 'support': 17350.0} | {'precision': 0.926259190167892, 'recall': 0.9149143724257534, 'f1-score': 0.9205518294345384, 'support': 9226.0} | 0.9423 | {'precision': 0.9224834161822804, 'recall': 0.9056509796177444, 'f1-score': 0.9138208820362266, 'support': 27619.0} | {'precision': 0.9420728499160095, 'recall': 0.9423223143488179, 'f1-score': 0.9421458709478862, 'support': 27619.0} |
| No log | 5.0 | 205 | 0.2035 | {'precision': 0.8457399103139014, 'recall': 0.9041227229146692, 'f1-score': 0.8739573679332716, 'support': 1043.0} | {'precision': 0.9367580161988239, 'recall': 0.9732564841498559, 'f1-score': 0.954658525554048, 'support': 17350.0} | {'precision': 0.948690728945506, 'recall': 0.8717754172989378, 'f1-score': 0.9086082241301401, 'support': 9226.0} | 0.9367 | {'precision': 0.9103962184860771, 'recall': 0.916384874787821, 'f1-score': 0.9124080392058199, 'support': 27619.0} | {'precision': 0.9373068891979518, 'recall': 0.9367464426662805, 'f1-score': 0.9362280469583188, 'support': 27619.0} |
| No log | 6.0 | 246 | 0.1896 | {'precision': 0.8579335793357934, 'recall': 0.8916586768935763, 'f1-score': 0.8744710860366715, 'support': 1043.0} | {'precision': 0.9394277427631212, 'recall': 0.970778097982709, 'f1-score': 0.9548456588905582, 'support': 17350.0} | {'precision': 0.941900999302812, 'recall': 0.8786039453717754, 'f1-score': 0.9091520861372813, 'support': 9226.0} | 0.9370 | {'precision': 0.9130874404672422, 'recall': 0.9136802400826869, 'f1-score': 0.9128229436881702, 'support': 27619.0} | {'precision': 0.9371763887090455, 'recall': 0.9369998913791231, 'f1-score': 0.9365466769683909, 'support': 27619.0} |
| No log | 7.0 | 287 | 0.1974 | {'precision': 0.854262144821265, 'recall': 0.8935762224352828, 'f1-score': 0.8734770384254921, 'support': 1043.0} | {'precision': 0.9436012321478577, 'recall': 0.9710662824207493, 'f1-score': 0.9571367703451216, 'support': 17350.0} | {'precision': 0.9436181252161882, 'recall': 0.8870583134619553, 'f1-score': 0.9144644952231968, 'support': 9226.0} | 0.9401 | {'precision': 0.9138271673951035, 'recall': 0.9172336061059957, 'f1-score': 0.9150261013312702, 'support': 27619.0} | {'precision': 0.9402330865729558, 'recall': 0.9400774828922119, 'f1-score': 0.9397229787282255, 'support': 27619.0} |
| No log | 8.0 | 328 | 0.2392 | {'precision': 0.851952770208901, 'recall': 0.8993288590604027, 'f1-score': 0.875, 'support': 1043.0} | {'precision': 0.9332269074094462, 'recall': 0.9771181556195966, 'f1-score': 0.9546683185043361, 'support': 17350.0} | {'precision': 0.9541427203065134, 'recall': 0.8637546065467158, 'f1-score': 0.90670155876664, 'support': 9226.0} | 0.9363 | {'precision': 0.9131074659749535, 'recall': 0.913400540408905, 'f1-score': 0.9121232924236587, 'support': 27619.0} | {'precision': 0.937144513575063, 'recall': 0.9363119591585503, 'f1-score': 0.9356366598077863, 'support': 27619.0} |
| No log | 9.0 | 369 | 0.2588 | {'precision': 0.8356890459363958, 'recall': 0.9069990412272292, 'f1-score': 0.8698850574712644, 'support': 1043.0} | {'precision': 0.9281042189033032, 'recall': 0.9813832853025937, 'f1-score': 0.9540004482294935, 'support': 17350.0} | {'precision': 0.9629038201695124, 'recall': 0.8496639930630826, 'f1-score': 0.9027465883572292, 'support': 9226.0} | 0.9346 | {'precision': 0.9088990283364038, 'recall': 0.9126821065309684, 'f1-score': 0.9088773646859957, 'support': 27619.0} | {'precision': 0.9362389122621345, 'recall': 0.9345740251276295, 'f1-score': 0.9337028102359982, 'support': 27619.0} |
| No log | 10.0 | 410 | 0.2737 | {'precision': 0.8562091503267973, 'recall': 0.8791946308724832, 'f1-score': 0.8675496688741721, 'support': 1043.0} | {'precision': 0.9356232686980609, 'recall': 0.973371757925072, 'f1-score': 0.9541242937853106, 'support': 17350.0} | {'precision': 0.9457519416333255, 'recall': 0.8711250812920008, 'f1-score': 0.9069058903182126, 'support': 9226.0} | 0.9357 | {'precision': 0.9125281202193946, 'recall': 0.9078971566965187, 'f1-score': 0.9095266176592318, 'support': 27619.0} | {'precision': 0.9360077218295835, 'recall': 0.935660233896955, 'f1-score': 0.9350818112852286, 'support': 27619.0} |
| No log | 11.0 | 451 | 0.2722 | {'precision': 0.8556701030927835, 'recall': 0.87535953978907, 'f1-score': 0.8654028436018957, 'support': 1043.0} | {'precision': 0.9378157792460163, 'recall': 0.9735446685878962, 'f1-score': 0.9553462854557281, 'support': 17350.0} | {'precision': 0.9459079733052336, 'recall': 0.8756774333405593, 'f1-score': 0.9094388473011763, 'support': 9226.0} | 0.9371 | {'precision': 0.9131312852146779, 'recall': 0.9081938805725085, 'f1-score': 0.9100626587862667, 'support': 27619.0} | {'precision': 0.937416801808836, 'recall': 0.9371447192150332, 'f1-score': 0.9366145053671137, 'support': 27619.0} |
| No log | 12.0 | 492 | 0.2749 | {'precision': 0.8612933458294283, 'recall': 0.8811121764141898, 'f1-score': 0.8710900473933649, 'support': 1043.0} | {'precision': 0.9410812921943871, 'recall': 0.9721613832853025, 'f1-score': 0.9563688940549427, 'support': 17350.0} | {'precision': 0.9440259589755475, 'recall': 0.8829395187513549, 'f1-score': 0.9124614953794455, 'support': 9226.0} | 0.9389 | {'precision': 0.9154668656664544, 'recall': 0.9120710261502823, 'f1-score': 0.9133068122759177, 'support': 27619.0} | {'precision': 0.9390518439038746, 'recall': 0.9389188602049314, 'f1-score': 0.938481371072642, 'support': 27619.0} |
| 0.1235 | 13.0 | 533 | 0.2709 | {'precision': 0.8675925925925926, 'recall': 0.8983700862895494, 'f1-score': 0.8827131417804992, 'support': 1043.0} | {'precision': 0.9542141230068337, 'recall': 0.9657636887608069, 'f1-score': 0.959954167860212, 'support': 17350.0} | {'precision': 0.9354048335003898, 'recall': 0.9103620203771949, 'f1-score': 0.9227135402361988, 'support': 9226.0} | 0.9447 | {'precision': 0.9190705163666054, 'recall': 0.9248319318091838, 'f1-score': 0.9217936166256367, 'support': 27619.0} | {'precision': 0.9446598031108019, 'recall': 0.9447119736413339, 'f1-score': 0.944597188220823, 'support': 27619.0} |
| 0.1235 | 14.0 | 574 | 0.2806 | {'precision': 0.8703878902554399, 'recall': 0.8820709491850431, 'f1-score': 0.8761904761904762, 'support': 1043.0} | {'precision': 0.9522888825703725, 'recall': 0.9651873198847263, 'f1-score': 0.9586947187634179, 'support': 17350.0} | {'precision': 0.9327169432995432, 'recall': 0.9075438976804683, 'f1-score': 0.919958248640334, 'support': 9226.0} | 0.9428 | {'precision': 0.9184645720417852, 'recall': 0.918267388916746, 'f1-score': 0.9182811478647427, 'support': 27619.0} | {'precision': 0.942658068757521, 'recall': 0.9427930048155255, 'f1-score': 0.9426393004514172, 'support': 27619.0} |
| 0.1235 | 15.0 | 615 | 0.2848 | {'precision': 0.865546218487395, 'recall': 0.8887823585810163, 'f1-score': 0.8770104068117313, 'support': 1043.0} | {'precision': 0.9523728525259398, 'recall': 0.9681268011527377, 'f1-score': 0.9601852116500414, 'support': 17350.0} | {'precision': 0.9386151947031759, 'recall': 0.9065683936700628, 'f1-score': 0.9223135027843635, 'support': 9226.0} | 0.9446 | {'precision': 0.9188447552388369, 'recall': 0.921159184467939, 'f1-score': 0.9198363737487121, 'support': 27619.0} | {'precision': 0.9444982614699631, 'recall': 0.9445671458054238, 'f1-score': 0.9443933398429122, 'support': 27619.0} |
| 0.1235 | 16.0 | 656 | 0.2879 | {'precision': 0.8652946679139383, 'recall': 0.8868648130393096, 'f1-score': 0.8759469696969697, 'support': 1043.0} | {'precision': 0.9512374695588152, 'recall': 0.9680691642651297, 'f1-score': 0.9595795126688947, 'support': 17350.0} | {'precision': 0.9381536039581694, 'recall': 0.9042922176457836, 'f1-score': 0.9209117500965837, 'support': 9226.0} | 0.9437 | {'precision': 0.9182285804769742, 'recall': 0.9197420649834077, 'f1-score': 0.9188127441541494, 'support': 27619.0} | {'precision': 0.9436213326187679, 'recall': 0.9436981787899634, 'f1-score': 0.9435044368221277, 'support': 27619.0} |
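
The table is wide, so the following sketch picks out the epochs with the lowest validation loss and the highest accuracy. The tuples are copied from the Epoch, Validation Loss, and Accuracy columns above; nothing else is assumed.

```python
# (epoch, validation_loss, accuracy) copied from the training results table.
history = [
    (1, 0.2947, 0.8978), (2, 0.1954, 0.9334), (3, 0.1738, 0.9396),
    (4, 0.1750, 0.9423), (5, 0.2035, 0.9367), (6, 0.1896, 0.9370),
    (7, 0.1974, 0.9401), (8, 0.2392, 0.9363), (9, 0.2588, 0.9346),
    (10, 0.2737, 0.9357), (11, 0.2722, 0.9371), (12, 0.2749, 0.9389),
    (13, 0.2709, 0.9447), (14, 0.2806, 0.9428), (15, 0.2848, 0.9446),
    (16, 0.2879, 0.9437),
]

# Epoch with the lowest validation loss.
best_loss = min(history, key=lambda row: row[1])
# Epoch with the highest accuracy.
best_acc = max(history, key=lambda row: row[2])

print(f"lowest validation loss: {best_loss[1]} at epoch {best_loss[0]}")
print(f"highest accuracy:       {best_acc[2]} at epoch {best_acc[0]}")
```

Validation loss bottoms out at epoch 3 (0.1738) and rises afterwards, while accuracy peaks at epoch 13 (0.9447). The reported results come from epoch 16, so they are neither the lowest-loss nor the highest-accuracy checkpoint.
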
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
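
To compare a local environment against the versions listed above, a small standard-library check along these lines may help. It assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`. The helper `check_versions` is illustrative, and a mismatch does not necessarily mean the model will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; "torch" is the PyPI name for PyTorch.
PINNED = {
    "transformers": "4.37.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}


def check_versions(pinned):
    """Return {package: (pinned, installed)}; installed is None if the package is absent."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in check_versions(PINNED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: pinned {wanted}, installed {installed} ({status})")
```
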