---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: train[80%:100%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9389550671639089
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longformer-spans

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2009
- Accuracy: 0.9390

| Label        | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| B            |    0.8471 | 0.8926 |   0.8693 |    1043 |
| I            |    0.9460 | 0.9669 |   0.9563 |   17350 |
| O            |    0.9363 | 0.8916 |   0.9134 |    9226 |
| Macro avg    |    0.9098 | 0.9170 |   0.9130 |   27619 |
| Weighted avg |    0.9390 | 0.9390 |   0.9387 |   27619 |
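
The per-class breakdown above resembles the output of scikit-learn's `classification_report`. The sketch below shows one way such a report can be computed from token-level predictions; the id-to-tag order and the use of `-100` for ignored positions are assumptions, not details recorded in this card.

```python
# Sketch only: reproduce a per-class B/I/O report from token predictions.
# LABELS and the -100 "ignore" convention are assumptions, not taken from this card.
from sklearn.metrics import classification_report

LABELS = ["B", "I", "O"]  # assumed id -> tag order; check model.config.id2label

def span_report(pred_ids, gold_ids):
    """Flatten batched per-token predictions, skip ignored positions, and score per tag."""
    y_true, y_pred = [], []
    for preds, golds in zip(pred_ids, gold_ids):
        for p, g in zip(preds, golds):
            if g == -100:  # padding / special tokens are excluded from the report
                continue
            y_true.append(LABELS[g])
            y_pred.append(LABELS[p])
    return classification_report(y_true, y_pred, digits=4, output_dict=True)
```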

## Model description

longformer-spans is a token classification model: for every token it predicts one of three tags, B (beginning of a span), I (inside a span), or O (outside any span), which together segment an essay into its annotated spans. It is built on [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096), whose sliding-window attention allows essays of up to 4,096 tokens to be processed in a single forward pass.
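
A minimal inference sketch is shown below; the checkpoint path `longformer-spans` is a placeholder, and the tag names are read from the model's own `config.id2label`.

```python
# Sketch only: tag an essay with B/I/O span labels.
# "longformer-spans" is a placeholder; point it at the actual checkpoint location.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

checkpoint = "longformer-spans"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(checkpoint)

essay = "School uniforms should be mandatory because they reduce distractions."
inputs = tokenizer(essay, return_tensors="pt", truncation=True, max_length=4096)

with torch.no_grad():
    logits = model(**inputs).logits

predictions = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predictions):
    print(token, model.config.id2label[label_id])
```

Special tokens such as `<s>` and `</s>` also receive predictions and can be dropped when reconstructing spans.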

## Intended uses & limitations

The model is intended for segmenting essay-like text into spans via token-level B/I/O tagging; it marks span boundaries only and does not classify what kind of span a segment is. It has been trained and evaluated solely on the essays_su_g corpus, so performance on other domains, writing styles, or languages is untested. Inputs longer than the base model's 4,096-token window must be truncated or split.

## Training and evaluation data

The model was trained and evaluated on the spans configuration of the essays_su_g dataset. According to the model-index metadata, evaluation used the train[80%:100%] slice (27,619 scored tokens: 1,043 B, 17,350 I, and 9,226 O); the complementary train[:80%] slice was presumably used for training.
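
A sketch of recreating that split with the `datasets` library follows; the bare identifier `essays_su_g` mirrors the name reported by the Trainer, and the actual Hub repository id or loading script may differ.

```python
# Sketch only: recreate the splits implied by the model-index metadata.
# "essays_su_g" is the dataset name as reported by the Trainer; the real
# Hub repository id or local loading script may be different.
from datasets import load_dataset

train_data = load_dataset("essays_su_g", "spans", split="train[:80%]")     # presumed training slice
eval_data = load_dataset("essays_su_g", "spans", split="train[80%:100%]")  # evaluation slice from the metadata
print(train_data, eval_data)
```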

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map onto `TrainingArguments` follows the list):
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 9
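
As a rough reconstruction, these values correspond to the `transformers.TrainingArguments` sketch below; the output directory and the per-epoch evaluation/save strategies are assumptions (the per-epoch rows in the results table suggest, but do not prove, epoch-level evaluation).

```python
# Sketch only: the listed hyperparameters expressed as TrainingArguments.
# output_dir and the evaluation/save strategies are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="longformer-spans",   # assumed
    learning_rate=2e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=9,
    evaluation_strategy="epoch",     # assumed from the per-epoch results below
    save_strategy="epoch",           # assumed
)
```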

### Training results

| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg | |
|:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:| |
| No log | 1.0 | 41 | 0.2937 | {'precision': 0.8082191780821918, 'recall': 0.3959731543624161, 'f1-score': 0.5315315315315315, 'support': 1043.0} | {'precision': 0.8851326600031398, 'recall': 0.9748703170028818, 'f1-score': 0.9278367481280343, 'support': 17350.0} | {'precision': 0.932616577072134, 'recall': 0.8085844352915673, 'f1-score': 0.8661828737300435, 'support': 9226.0} | 0.8975 | {'precision': 0.8753228050524885, 'recall': 0.7264759688856217, 'f1-score': 0.7751837177965365, 'support': 27619.0} | {'precision': 0.8980898944155006, 'recall': 0.8974618921756762, 'f1-score': 0.8922755407669418, 'support': 27619.0} | |
| No log | 2.0 | 82 | 0.2221 | {'precision': 0.7776852622814321, 'recall': 0.8954937679769894, 'f1-score': 0.8324420677361853, 'support': 1043.0} | {'precision': 0.9200997398091935, 'recall': 0.978328530259366, 'f1-score': 0.948321135258953, 'support': 17350.0} | {'precision': 0.9626097867001254, 'recall': 0.8315629742033384, 'f1-score': 0.8923005350081413, 'support': 9226.0} | 0.9262 | {'precision': 0.8867982629302503, 'recall': 0.9017950908132312, 'f1-score': 0.8910212460010932, 'support': 27619.0} | {'precision': 0.9289219054398928, 'recall': 0.926174010644846, 'f1-score': 0.9252316705665227, 'support': 27619.0} | |
| No log | 3.0 | 123 | 0.1732 | {'precision': 0.8459409594095941, 'recall': 0.8791946308724832, 'f1-score': 0.8622472966619651, 'support': 1043.0} | {'precision': 0.963898493817031, 'recall': 0.9479538904899135, 'f1-score': 0.9558597041815592, 'support': 17350.0} | {'precision': 0.9060388513513513, 'recall': 0.9301972685887708, 'f1-score': 0.9179591400149748, 'support': 9226.0} | 0.9394 | {'precision': 0.9052927681926587, 'recall': 0.9191152633170558, 'f1-score': 0.9120220469528331, 'support': 27619.0} | {'precision': 0.9401162145970984, 'recall': 0.9394257576306166, 'f1-score': 0.9396640292460493, 'support': 27619.0} | |
| No log | 4.0 | 164 | 0.1893 | {'precision': 0.8392523364485981, 'recall': 0.8609779482262704, 'f1-score': 0.8499763369616659, 'support': 1043.0} | {'precision': 0.9343029364596582, 'recall': 0.9737752161383285, 'f1-score': 0.9536307961504812, 'support': 17350.0} | {'precision': 0.946491849751949, 'recall': 0.8685237372642532, 'f1-score': 0.9058331449242596, 'support': 9226.0} | 0.9344 | {'precision': 0.9066823742200686, 'recall': 0.9010923005429508, 'f1-score': 0.9031467593454688, 'support': 27619.0} | {'precision': 0.9347851095370013, 'recall': 0.9343567833737645, 'f1-score': 0.9337498181589879, 'support': 27619.0} | |
| No log | 5.0 | 205 | 0.1928 | {'precision': 0.8462946020128088, 'recall': 0.8868648130393096, 'f1-score': 0.8661048689138576, 'support': 1043.0} | {'precision': 0.9407601426660722, 'recall': 0.9729682997118155, 'f1-score': 0.9565931886439621, 'support': 17350.0} | {'precision': 0.9475646702400373, 'recall': 0.8814220680685021, 'f1-score': 0.91329739442947, 'support': 9226.0} | 0.9391 | {'precision': 0.9115398049729727, 'recall': 0.9137517269398758, 'f1-score': 0.9119984839957632, 'support': 27619.0} | {'precision': 0.939465780542029, 'recall': 0.9391361019587965, 'f1-score': 0.9387132395183093, 'support': 27619.0} | |
| No log | 6.0 | 246 | 0.1784 | {'precision': 0.8283712784588442, 'recall': 0.9069990412272292, 'f1-score': 0.8659038901601832, 'support': 1043.0} | {'precision': 0.9433644229688729, 'recall': 0.9677233429394813, 'f1-score': 0.9553886423125071, 'support': 17350.0} | {'precision': 0.9398548219840995, 'recall': 0.8841318014307392, 'f1-score': 0.9111421390672997, 'support': 9226.0} | 0.9375 | {'precision': 0.9038635078039389, 'recall': 0.9196180618658166, 'f1-score': 0.9108115571799966, 'support': 27619.0} | {'precision': 0.9378494720868902, 'recall': 0.9375067888048083, 'f1-score': 0.9372290117887677, 'support': 27619.0} | |
| No log | 7.0 | 287 | 0.1897 | {'precision': 0.8537037037037037, 'recall': 0.8839884947267498, 'f1-score': 0.8685821950070655, 'support': 1043.0} | {'precision': 0.9477176070314715, 'recall': 0.96328530259366, 'f1-score': 0.9554380448763755, 'support': 17350.0} | {'precision': 0.9293575920934412, 'recall': 0.8969217429004986, 'f1-score': 0.9128516271373415, 'support': 9226.0} | 0.9381 | {'precision': 0.9102596342762054, 'recall': 0.9147318467403028, 'f1-score': 0.9122906223402608, 'support': 27619.0} | {'precision': 0.9380342007173714, 'recall': 0.9381223071074261, 'f1-score': 0.9379322357785074, 'support': 27619.0} | |
| No log | 8.0 | 328 | 0.1994 | {'precision': 0.8458029197080292, 'recall': 0.8887823585810163, 'f1-score': 0.8667601683029453, 'support': 1043.0} | {'precision': 0.941661062542031, 'recall': 0.9684726224783862, 'f1-score': 0.9548786725009946, 'support': 17350.0} | {'precision': 0.938241732918539, 'recall': 0.8826143507478864, 'f1-score': 0.9095783300753979, 'support': 9226.0} | 0.9368 | {'precision': 0.9085685717228663, 'recall': 0.9132897772690963, 'f1-score': 0.9104057236264459, 'support': 27619.0} | {'precision': 0.9368988778835639, 'recall': 0.9367826496252579, 'f1-score': 0.9364186066370198, 'support': 27619.0} | |
| No log | 9.0 | 369 | 0.2009 | {'precision': 0.8471337579617835, 'recall': 0.8926174496644296, 'f1-score': 0.8692810457516339, 'support': 1043.0} | {'precision': 0.9459794744558475, 'recall': 0.9669164265129683, 'f1-score': 0.9563333713373617, 'support': 17350.0} | {'precision': 0.9362622353744594, 'recall': 0.8916106655105137, 'f1-score': 0.9133910726182546, 'support': 9226.0} | 0.9390 | {'precision': 0.9097918225973635, 'recall': 0.9170481805626371, 'f1-score': 0.9130018299024169, 'support': 27619.0} | {'precision': 0.9390006797830427, 'recall': 0.9389550671639089, 'f1-score': 0.9387012621528006, 'support': 27619.0} | |

### Framework versions

- Transformers 4.37.2
- PyTorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2