---
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: simple
      split: train[80%:100%]
      args: simple
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8379014446576633
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longformer-simple

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4267
- Claim: {'precision': 0.6011011011011012, 'recall': 0.5762955854126679, 'f1-score': 0.58843704066634, 'support': 4168.0}
- Majorclaim: {'precision': 0.7353560893383903, 'recall': 0.8108736059479554, 'f1-score': 0.7712707182320443, 'support': 2152.0}
- O: {'precision': 0.9331677579589072, 'recall': 0.8959462388900932, 'f1-score': 0.9141782791417828, 'support': 9226.0}
- Premise: {'precision': 0.8658005164622337, 'recall': 0.8886772136171622, 'f1-score': 0.8770897200081749, 'support': 12073.0}
- Accuracy: 0.8379
- Macro avg: {'precision': 0.7838563662151581, 'recall': 0.7929481609669696, 'f1-score': 0.7877439395120855, 'support': 27619.0}
- Weighted avg: {'precision': 0.838194397473588, 'recall': 0.8379014446576633, 'f1-score': 0.8376730933108891, 'support': 27619.0}
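
The aggregate rows can be reproduced from the per-class rows. The sketch below assumes the usual definitions (macro average as the unweighted mean over classes, weighted average as the support-weighted mean) and recomputes the final F1 averages from the values listed above. It also shows that the support-weighted recall coincides with the reported accuracy. The helper names `macro_avg` and `weighted_avg` are illustrative and not part of the training code.

```python
# Per-class metrics copied from the evaluation results above.
per_class = {
    "Claim":      {"recall": 0.5762955854126679, "f1": 0.58843704066634,   "support": 4168},
    "Majorclaim": {"recall": 0.8108736059479554, "f1": 0.7712707182320443, "support": 2152},
    "O":          {"recall": 0.8959462388900932, "f1": 0.9141782791417828, "support": 9226},
    "Premise":    {"recall": 0.8886772136171622, "f1": 0.8770897200081749, "support": 12073},
}


def macro_avg(metric: str) -> float:
    """Unweighted mean of a metric over all classes."""
    return sum(c[metric] for c in per_class.values()) / len(per_class)


def weighted_avg(metric: str) -> float:
    """Mean of a metric over classes, weighted by class support."""
    total = sum(c["support"] for c in per_class.values())
    return sum(c[metric] * c["support"] for c in per_class.values()) / total


macro_f1 = macro_avg("f1")
weighted_f1 = weighted_avg("f1")
weighted_recall = weighted_avg("recall")  # equals overall accuracy

print(f"macro F1:        {macro_f1:.4f}")
print(f"weighted F1:     {weighted_f1:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

The gap between macro F1 and weighted F1 reflects the class imbalance: `Claim`, the weakest class, counts for a quarter of the macro average but only about 15% of the weighted one.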

## Model description

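This is a token-classification model built on [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096). Judging by the evaluation metrics, it assigns each token one of four labels: `Claim`, `Majorclaim`, `Premise`, or `O`.

More information needed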

## Intended uses & limitations

More information needed

## Training and evaluation data

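According to the model card metadata, evaluation used the `train[80%:100%]` split of the `simple` configuration of essays_su_g. The per-class supports in the results sum to 27,619 evaluated items.

More information needed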

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
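
To make the `linear` scheduler setting concrete, the following sketch computes the learning rate at the end of each epoch. It assumes zero warmup steps (none are listed above) and a decay to zero at the final optimizer step, which the training results table puts at step 205 (41 steps per epoch). The function `linear_lr` is a hypothetical stand-in written for illustration, not the code the Trainer ran.

```python
BASE_LR = 2e-05       # learning_rate from the list above
STEPS_PER_EPOCH = 41  # from the Step column of the training results table
NUM_EPOCHS = 5        # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 205
WARMUP_STEPS = 0      # assumption: the card lists no warmup


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch} (step {step}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate falls by one fifth of its initial value per epoch, so the last epoch trains with a rate between 4e-06 and 0.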

### Training results

| Training Loss | Epoch | Step | Validation Loss | Claim                                                                                                                | Majorclaim                                                                                                         | O                                                                                                                  | Premise                                                                                                             | Accuracy | Macro avg                                                                                                           | Weighted avg                                                                                                        |
|:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log        | 1.0   | 41   | 0.6166          | {'precision': 0.4200196270853778, 'recall': 0.2053742802303263, 'f1-score': 0.27586206896551724, 'support': 4168.0}  | {'precision': 0.6073394495412844, 'recall': 0.46143122676579923, 'f1-score': 0.524425666754687, 'support': 2152.0} | {'precision': 0.897315672254132, 'recall': 0.8297203555170172, 'f1-score': 0.8621951906290477, 'support': 9226.0}  | {'precision': 0.7481024975673046, 'recall': 0.9551892653027416, 'f1-score': 0.8390570430733411, 'support': 12073.0} | 0.7616   | {'precision': 0.6681943116120247, 'recall': 0.612928781953971, 'f1-score': 0.6253849923556483, 'support': 27619.0}  | {'precision': 0.7374674009360002, 'recall': 0.7616495890510157, 'f1-score': 0.7372788894627758, 'support': 27619.0} |
| No log        | 2.0   | 82   | 0.4575          | {'precision': 0.5743048897411314, 'recall': 0.43114203454894434, 'f1-score': 0.49253117719610806, 'support': 4168.0} | {'precision': 0.7058560572194904, 'recall': 0.7337360594795539, 'f1-score': 0.7195260879471406, 'support': 2152.0} | {'precision': 0.9206993795826283, 'recall': 0.8846737481031867, 'f1-score': 0.9023271239843015, 'support': 9226.0} | {'precision': 0.8243949805796236, 'recall': 0.9141886854965626, 'f1-score': 0.8669730175562625, 'support': 12073.0} | 0.8174   | {'precision': 0.7563138267807185, 'recall': 0.7409351319070618, 'f1-score': 0.7453393516709531, 'support': 27619.0} | {'precision': 0.8095875336596003, 'recall': 0.8173720989174119, 'f1-score': 0.8107869718183696, 'support': 27619.0} |
| No log        | 3.0   | 123  | 0.4417          | {'precision': 0.6082102988836874, 'recall': 0.4052303262955854, 'f1-score': 0.4863930885529157, 'support': 4168.0}   | {'precision': 0.7309513560051657, 'recall': 0.7890334572490706, 'f1-score': 0.7588826815642457, 'support': 2152.0} | {'precision': 0.9306548632391329, 'recall': 0.8887925428137872, 'f1-score': 0.9092421134334979, 'support': 9226.0} | {'precision': 0.8175517945725124, 'recall': 0.9282696927027251, 'f1-score': 0.8693999456964432, 'support': 12073.0} | 0.8253   | {'precision': 0.7718420781751247, 'recall': 0.7528315047652921, 'f1-score': 0.7559794573117755, 'support': 27619.0} | {'precision': 0.8169938241061772, 'recall': 0.8253014229334878, 'f1-score': 0.8162980269649668, 'support': 27619.0} |
| No log        | 4.0   | 164  | 0.4247          | {'precision': 0.5918674698795181, 'recall': 0.5657389635316699, 'f1-score': 0.5785083415112856, 'support': 4168.0}   | {'precision': 0.7616387337057728, 'recall': 0.7602230483271375, 'f1-score': 0.7609302325581395, 'support': 2152.0} | {'precision': 0.918848167539267, 'recall': 0.9130717537394321, 'f1-score': 0.9159508535391975, 'support': 9226.0}  | {'precision': 0.8669534864842926, 'recall': 0.8846185703636213, 'f1-score': 0.8756969498196131, 'support': 12073.0} | 0.8363   | {'precision': 0.7848269644022126, 'recall': 0.7809130839904652, 'f1-score': 0.782771594357059, 'support': 27619.0}  | {'precision': 0.8345694197992249, 'recall': 0.8363083384626525, 'f1-score': 0.8353523472178203, 'support': 27619.0} |
| No log        | 5.0   | 205  | 0.4267          | {'precision': 0.6011011011011012, 'recall': 0.5762955854126679, 'f1-score': 0.58843704066634, 'support': 4168.0}     | {'precision': 0.7353560893383903, 'recall': 0.8108736059479554, 'f1-score': 0.7712707182320443, 'support': 2152.0} | {'precision': 0.9331677579589072, 'recall': 0.8959462388900932, 'f1-score': 0.9141782791417828, 'support': 9226.0} | {'precision': 0.8658005164622337, 'recall': 0.8886772136171622, 'f1-score': 0.8770897200081749, 'support': 12073.0} | 0.8379   | {'precision': 0.7838563662151581, 'recall': 0.7929481609669696, 'f1-score': 0.7877439395120855, 'support': 27619.0} | {'precision': 0.838194397473588, 'recall': 0.8379014446576633, 'f1-score': 0.8376730933108891, 'support': 27619.0}  |


### Framework versions

- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2