---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: simple
      split: train[80%:100%]
      args: simple
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8427169702016728
---

# longformer-simple

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5229
- Accuracy: 0.8427

| Label        | Precision | Recall | F1-score | Support |
|:-------------|:---------:|:------:|:--------:|:-------:|
| Claim        | 0.5976    | 0.6161 | 0.6067   | 4168    |
| Majorclaim   | 0.7766    | 0.8095 | 0.7927   | 2152    |
| O            | 0.9332    | 0.8997 | 0.9162   | 9226    |
| Premise      | 0.8752    | 0.8833 | 0.8793   | 12073   |
| Macro avg    | 0.7957    | 0.8022 | 0.7987   | 27619   |
| Weighted avg | 0.8450    | 0.8427 | 0.8437   | 27619   |
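The macro and weighted averages follow directly from the per-class scores and their supports. A minimal sketch in plain Python, using the full-precision final-epoch F1 scores from the training log below:

```python
# Final-epoch per-class F1 and token support, from the evaluation results.
per_class = {
    "Claim":      {"f1": 0.6067336089781453, "support": 4168},
    "Majorclaim": {"f1": 0.7927189988623435, "support": 2152},
    "O":          {"f1": 0.9161746040505491, "support": 9226},
    "Premise":    {"f1": 0.8792513501257369, "support": 12073},
}

total = sum(c["support"] for c in per_class.values())  # 27619 tokens

# Macro average: unweighted mean over the four classes.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: mean weighted by each class's support.
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total

print(round(macro_f1, 4), round(weighted_f1, 4))  # → 0.7987 0.8437
```

The gap between the macro F1 (0.7987) and weighted F1 (0.8437) reflects the class imbalance: the weakest class, Claim, is also much smaller than O and Premise, so it drags the unweighted mean down more than the support-weighted one.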

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
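For reference, the `linear` scheduler decays the learning rate from its peak to zero over the total number of optimization steps; with 41 steps per epoch (per the log below) that is 410 steps in total. A minimal sketch, assuming zero warmup steps (the Trainer default when none are configured):

```python
# Linear LR decay from the configured peak to 0, assuming no warmup.
peak_lr = 2e-5
total_steps = 10 * 41  # num_epochs * steps per epoch (41, per the training log)

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates under a linear schedule."""
    return peak_lr * max(0.0, 1.0 - step / total_steps)

print(lr_at(0), lr_at(205), lr_at(410))  # peak, roughly half the peak, then 0
```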

### Training results

| Training Loss | Epoch | Step | Validation Loss | Claim                                                                                                               | Majorclaim                                                                                                         | O                                                                                                                  | Premise                                                                                                             | Accuracy | Macro avg                                                                                                           | Weighted avg                                                                                                        |
|:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log        | 1.0   | 41   | 0.5686          | {'precision': 0.5015495867768595, 'recall': 0.2329654510556622, 'f1-score': 0.31815203145478377, 'support': 4168.0} | {'precision': 0.5294117647058824, 'recall': 0.6565985130111525, 'f1-score': 0.5861854387056629, 'support': 2152.0} | {'precision': 0.9169171576289528, 'recall': 0.826577064816822, 'f1-score': 0.8694066009234452, 'support': 9226.0}  | {'precision': 0.778798394230115, 'recall': 0.9480659322455065, 'f1-score': 0.8551363466567052, 'support': 12073.0}  | 0.7769   | {'precision': 0.6816692258354524, 'recall': 0.6660517402822859, 'f1-score': 0.6572201044351493, 'support': 27619.0} | {'precision': 0.7636649952988127, 'recall': 0.7768565118215721, 'f1-score': 0.7579106826642613, 'support': 27619.0} |
| No log        | 2.0   | 82   | 0.4450          | {'precision': 0.5915697674418605, 'recall': 0.4882437619961612, 'f1-score': 0.5349631966351209, 'support': 4168.0}  | {'precision': 0.7189862160960427, 'recall': 0.7513940520446096, 'f1-score': 0.7348329925017042, 'support': 2152.0} | {'precision': 0.9161246916348957, 'recall': 0.8855408627791025, 'f1-score': 0.900573192239859, 'support': 9226.0}  | {'precision': 0.8421457116507839, 'recall': 0.9076451586184047, 'f1-score': 0.873669523619693, 'support': 12073.0}  | 0.8248   | {'precision': 0.7672065967058957, 'recall': 0.7582059588595695, 'f1-score': 0.7610097262490942, 'support': 27619.0} | {'precision': 0.8194472178398864, 'recall': 0.8247945255078026, 'f1-score': 0.8207244155727703, 'support': 27619.0} |
| No log        | 3.0   | 123  | 0.4291          | {'precision': 0.5668412662263721, 'recall': 0.597168905950096, 'f1-score': 0.5816100011683608, 'support': 4168.0}   | {'precision': 0.7096088435374149, 'recall': 0.7755576208178439, 'f1-score': 0.7411190053285968, 'support': 2152.0} | {'precision': 0.9522893882946761, 'recall': 0.858877086494689, 'f1-score': 0.9031743317946088, 'support': 9226.0}  | {'precision': 0.8618876941457586, 'recall': 0.8962975233993208, 'f1-score': 0.878755887607601, 'support': 12073.0}  | 0.8292   | {'precision': 0.7726567980510554, 'recall': 0.7819752841654874, 'f1-score': 0.7761648064747918, 'support': 27619.0} | {'precision': 0.8356951611844187, 'recall': 0.829247981462037, 'f1-score': 0.8313459864788912, 'support': 27619.0}  |
| No log        | 4.0   | 164  | 0.4224          | {'precision': 0.635280095351609, 'recall': 0.5115163147792706, 'f1-score': 0.5667198298777245, 'support': 4168.0}   | {'precision': 0.7965432098765433, 'recall': 0.7495353159851301, 'f1-score': 0.7723246349054346, 'support': 2152.0} | {'precision': 0.9076593465452598, 'recall': 0.9183828311294169, 'f1-score': 0.9129896018533484, 'support': 9226.0} | {'precision': 0.8526699217236302, 'recall': 0.9112896546011762, 'f1-score': 0.8810057655349135, 'support': 12073.0} | 0.8407   | {'precision': 0.7980381433742605, 'recall': 0.7726810291237485, 'f1-score': 0.7832599580428552, 'support': 27619.0} | {'precision': 0.8338592100103474, 'recall': 0.8407255874579094, 'f1-score': 0.8357925898565789, 'support': 27619.0} |
| No log        | 5.0   | 205  | 0.4366          | {'precision': 0.5841626085115392, 'recall': 0.6619481765834933, 'f1-score': 0.6206276009447756, 'support': 4168.0}  | {'precision': 0.7325534489713594, 'recall': 0.8438661710037175, 'f1-score': 0.7842798531634635, 'support': 2152.0} | {'precision': 0.9277765412864456, 'recall': 0.9036418816388467, 'f1-score': 0.9155501866900944, 'support': 9226.0} | {'precision': 0.8999212667308197, 'recall': 0.8520665948811398, 'f1-score': 0.8753403675970047, 'support': 12073.0} | 0.8400   | {'precision': 0.786103466375041, 'recall': 0.8153807060267992, 'f1-score': 0.7989495020988346, 'support': 27619.0}  | {'precision': 0.848534001868728, 'recall': 0.8399652413193816, 'f1-score': 0.8432382188039772, 'support': 27619.0}  |
| No log        | 6.0   | 246  | 0.4462          | {'precision': 0.6034715960324617, 'recall': 0.642274472168906, 'f1-score': 0.6222687122268712, 'support': 4168.0}   | {'precision': 0.7745405647691618, 'recall': 0.8029739776951673, 'f1-score': 0.7885010266940452, 'support': 2152.0} | {'precision': 0.9244103126714207, 'recall': 0.913288532408411, 'f1-score': 0.9188157679515838, 'support': 9226.0}  | {'precision': 0.8895835093351356, 'recall': 0.8721941522405368, 'f1-score': 0.8808030112923464, 'support': 12073.0} | 0.8458   | {'precision': 0.7980014957020449, 'recall': 0.8076827836282554, 'f1-score': 0.8025971295412117, 'support': 27619.0} | {'precision': 0.8490760766340619, 'recall': 0.8458307686737391, 'f1-score': 0.8472935020261775, 'support': 27619.0} |
| No log        | 7.0   | 287  | 0.4800          | {'precision': 0.6023660067600193, 'recall': 0.5986084452975048, 'f1-score': 0.6004813477737665, 'support': 4168.0}  | {'precision': 0.7868324125230203, 'recall': 0.7941449814126395, 'f1-score': 0.7904717853839038, 'support': 2152.0} | {'precision': 0.9340485601355166, 'recall': 0.8964881855625406, 'f1-score': 0.9148830263812843, 'support': 9226.0} | {'precision': 0.8673895582329317, 'recall': 0.894475275407935, 'f1-score': 0.8807242180809852, 'support': 12073.0}  | 0.8427   | {'precision': 0.797659134412872, 'recall': 0.7959292219201549, 'f1-score': 0.796640094404985, 'support': 27619.0}   | {'precision': 0.8433850255361078, 'recall': 0.8426807632426953, 'f1-score': 0.8428109571654543, 'support': 27619.0} |
| No log        | 8.0   | 328  | 0.5120          | {'precision': 0.5682210708117443, 'recall': 0.6314779270633397, 'f1-score': 0.5981818181818181, 'support': 4168.0}  | {'precision': 0.7462057335581788, 'recall': 0.8224907063197026, 'f1-score': 0.7824933687002653, 'support': 2152.0} | {'precision': 0.9381360777587193, 'recall': 0.8892261001517451, 'f1-score': 0.9130265427633409, 'support': 9226.0} | {'precision': 0.8802864363942713, 'recall': 0.865484966454071, 'f1-score': 0.8728229545169779, 'support': 12073.0}  | 0.8348   | {'precision': 0.7832123296307284, 'recall': 0.8021699249972145, 'f1-score': 0.7916311710406005, 'support': 27619.0} | {'precision': 0.8420696535627841, 'recall': 0.8347514392266193, 'f1-score': 0.8377682740520238, 'support': 27619.0} |
| No log        | 9.0   | 369  | 0.5199          | {'precision': 0.6004990925589837, 'recall': 0.6350767754318618, 'f1-score': 0.617304104477612, 'support': 4168.0}   | {'precision': 0.7618432385874246, 'recall': 0.8220260223048327, 'f1-score': 0.7907912382655341, 'support': 2152.0} | {'precision': 0.9352794749943426, 'recall': 0.8959462388900932, 'f1-score': 0.9151904340124003, 'support': 9226.0} | {'precision': 0.88083976433491, 'recall': 0.879234655843618, 'f1-score': 0.8800364781959874, 'support': 12073.0}    | 0.8435   | {'precision': 0.7946153926189152, 'recall': 0.8080709231176013, 'f1-score': 0.8008305637378834, 'support': 27619.0} | {'precision': 0.8474468220550765, 'recall': 0.8435135232991781, 'f1-score': 0.8451766391856577, 'support': 27619.0} |
| No log        | 10.0  | 410  | 0.5229          | {'precision': 0.5976262508727019, 'recall': 0.6161228406909789, 'f1-score': 0.6067336089781453, 'support': 4168.0}  | {'precision': 0.7766384306732055, 'recall': 0.8094795539033457, 'f1-score': 0.7927189988623435, 'support': 2152.0} | {'precision': 0.9332209106239461, 'recall': 0.8997398655972252, 'f1-score': 0.9161746040505491, 'support': 9226.0} | {'precision': 0.8752462245567958, 'recall': 0.8832932990971589, 'f1-score': 0.8792513501257369, 'support': 12073.0} | 0.8427   | {'precision': 0.7956829541816623, 'recall': 0.8021588898221772, 'f1-score': 0.7987196405041936, 'support': 27619.0} | {'precision': 0.8450333432396857, 'recall': 0.8427169702016728, 'f1-score': 0.8437172024624736, 'support': 27619.0} |
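Validation loss bottoms out at epoch 4 (0.4224) and rises steadily afterwards, while accuracy peaks at epoch 6 (0.8458), so the final epoch-10 checkpoint is not the strongest by either criterion. A quick scan of the loss column above:

```python
# Validation loss per epoch, copied from the training log above.
val_loss = {1: 0.5686, 2: 0.4450, 3: 0.4291, 4: 0.4224, 5: 0.4366,
            6: 0.4462, 7: 0.4800, 8: 0.5120, 9: 0.5199, 10: 0.5229}

best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # → 4 0.4224
```

If retraining, setting `load_best_model_at_end=True` together with a suitable `metric_for_best_model` in `TrainingArguments` would keep the best checkpoint rather than the last one.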


### Framework versions

- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2