trainer: training complete at 2024-03-02 12:20:23.213790.
---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: train[40%:60%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9435675748131765
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the `spans` configuration of the essays_su_g dataset.
It achieves the following results on the evaluation set (`train[40%:60%]`) after the final epoch (epoch 16, step 656):
- Loss: 0.3010
- B: {'precision': 0.8761061946902655, 'recall': 0.9173745173745174, 'f1-score': 0.8962655601659751, 'support': 1295.0}
- I: {'precision': 0.9587562509283557, 'recall': 0.9650635434836781, 'f1-score': 0.9618995578957825, 'support': 20065.0}
- O: {'precision': 0.9175916988416989, 'recall': 0.8967102935974531, 'f1-score': 0.907030830699505, 'support': 8481.0}
- Accuracy: 0.9436
- Macro avg: {'precision': 0.9174847148201067, 'recall': 0.9263827848185495, 'f1-score': 0.9217319829204209, 'support': 29841.0}
- Weighted avg: {'precision': 0.943470289027774, 'recall': 0.9435675748131765, 'f1-score': 0.9434572234427906, 'support': 29841.0}
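
The two averages above follow directly from the per-class rows. The sketch below recomputes them from the reported numbers, assuming the usual definitions (macro = unweighted mean over classes, weighted = mean weighted by class support); the variable and function names are illustrative only.

```python
# Per-class metrics copied from the evaluation results above.
per_class = {
    "B": {"precision": 0.8761061946902655, "recall": 0.9173745173745174,
          "f1-score": 0.8962655601659751, "support": 1295.0},
    "I": {"precision": 0.9587562509283557, "recall": 0.9650635434836781,
          "f1-score": 0.9618995578957825, "support": 20065.0},
    "O": {"precision": 0.9175916988416989, "recall": 0.8967102935974531,
          "f1-score": 0.907030830699505, "support": 8481.0},
}

def macro_avg(metric):
    """Unweighted mean of a metric over the three classes."""
    return sum(row[metric] for row in per_class.values()) / len(per_class)

def weighted_avg(metric):
    """Mean of a metric weighted by each class's support (token count)."""
    total = sum(row["support"] for row in per_class.values())
    return sum(row[metric] * row["support"] for row in per_class.values()) / total

print("macro precision:", macro_avg("precision"))
print("macro f1:       ", macro_avg("f1-score"))
print("weighted recall:", weighted_avg("recall"))
```

Support-weighted recall is the same quantity as token-level accuracy, which is why the weighted-average recall and the Accuracy line agree. Because the I class holds about two thirds of the tokens, it dominates both figures; the macro average gives a better picture of the rarer B class.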
## Model description
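This is allenai/longformer-base-4096 fine-tuned for token classification on the `spans` configuration of essays_su_g. It assigns each token one of three labels: B, I, or O. The card does not define the labels; they presumably follow the common begin/inside/outside span-tagging scheme. More information needed.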
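
The B, I and O classes reported in the results suggest a begin/inside/outside span-tagging scheme, although the card does not say so explicitly. Under that assumption, per-token predictions can be turned into spans as in the sketch below. `bio_to_spans` is a hypothetical helper written for this illustration, not part of the model or of any library.

```python
def bio_to_spans(labels):
    """Convert a sequence of "B"/"I"/"O" labels into (start, end) token spans.

    `end` is exclusive. Assumes B opens a span, I continues it and O is
    outside any span. An I with no preceding B is treated as opening a span.
    """
    spans = []
    start = None  # index where the currently open span began, if any
    for i, label in enumerate(labels):
        if label == "B":
            if start is not None:      # a new B closes the previous span
                spans.append((start, i))
            start = i
        elif label == "I":
            if start is None:          # tolerate an I without a B
                start = i
        else:                          # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:              # span running to the end of the input
        spans.append((start, len(labels)))
    return spans

print(bio_to_spans(["O", "B", "I", "I", "O", "B", "I"]))
```

Treating a stray I as a span start is one of several possible repair policies; dropping such tokens is an equally valid choice, depending on the downstream use.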
## Intended uses & limitations
More information needed
## Training and evaluation data
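Evaluation used the `train[40%:60%]` split of the `spans` configuration of essays_su_g, which contains 29,841 labeled tokens (1,295 B, 20,065 I, 8,481 O). The split used for training is not recorded in this card. More information needed.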
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
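
To make the schedule concrete, the sketch below evaluates a linear learning-rate decay for this run. It assumes zero warmup steps (no warmup is listed above) and takes the total of 656 optimizer steps (41 per epoch over 16 epochs) from the training results table. `linear_lr` is an illustrative function, not the Trainer's own code.

```python
import math

BASE_LR = 2e-5       # learning_rate from the list above
TOTAL_STEPS = 656    # 41 steps/epoch * 16 epochs, from the results table
WARMUP_STEPS = 0     # assumption: no warmup is listed in this card

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 164, 328, 656):
    print(step, linear_lr(step))

# With batch size 8 and 41 steps per epoch, the training set would hold
# between 321 and 328 sequences, assuming a single device and no gradient
# accumulation (neither is stated in this card).
steps_per_epoch, batch_size = 41, 8
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size
assert math.ceil(min_examples / batch_size) == steps_per_epoch
print(min_examples, max_examples)
```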
### Training results
| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.3153 | {'precision': 0.8472222222222222, 'recall': 0.47104247104247104, 'f1-score': 0.6054590570719602, 'support': 1295.0} | {'precision': 0.8894522863277146, 'recall': 0.9703962123099925, 'f1-score': 0.9281628372580799, 'support': 20065.0} | {'precision': 0.8983402489626556, 'recall': 0.7658295012380616, 'f1-score': 0.826809241932404, 'support': 8481.0} | 0.8906 | {'precision': 0.8783382525041974, 'recall': 0.735756061530175, 'f1-score': 0.786810378754148, 'support': 29841.0} | {'precision': 0.8901456571293072, 'recall': 0.8905867765825543, 'f1-score': 0.8853532384745914, 'support': 29841.0} |
| No log | 2.0 | 82 | 0.2253 | {'precision': 0.7966329966329966, 'recall': 0.9135135135135135, 'f1-score': 0.8510791366906474, 'support': 1295.0} | {'precision': 0.9247806497510078, 'recall': 0.9717916770495888, 'f1-score': 0.9477035236938031, 'support': 20065.0} | {'precision': 0.9339843212763032, 'recall': 0.8007310458672326, 'f1-score': 0.8622397155916709, 'support': 8481.0} | 0.9206 | {'precision': 0.8851326558867693, 'recall': 0.895345412143445, 'f1-score': 0.8870074586587071, 'support': 29841.0} | {'precision': 0.9218352098333846, 'recall': 0.9206460909486948, 'f1-score': 0.9192209950358067, 'support': 29841.0} |
| No log | 3.0 | 123 | 0.1809 | {'precision': 0.8226027397260274, 'recall': 0.9274131274131274, 'f1-score': 0.8718693284936478, 'support': 1295.0} | {'precision': 0.9520828198175992, 'recall': 0.9625218041365562, 'f1-score': 0.95727385377943, 'support': 20065.0} | {'precision': 0.91600790513834, 'recall': 0.8744251857092324, 'f1-score': 0.8947336671291549, 'support': 8481.0} | 0.9360 | {'precision': 0.8968978215606556, 'recall': 0.9214533724196388, 'f1-score': 0.9079589498007442, 'support': 29841.0} | {'precision': 0.9362110978540797, 'recall': 0.9359605911330049, 'f1-score': 0.9357932672298482, 'support': 29841.0} |
| No log | 4.0 | 164 | 0.1962 | {'precision': 0.8513513513513513, 'recall': 0.9243243243243243, 'f1-score': 0.8863383931877082, 'support': 1295.0} | {'precision': 0.942660770931462, 'recall': 0.9774732120608024, 'f1-score': 0.9597514129823103, 'support': 20065.0} | {'precision': 0.9454712282081531, 'recall': 0.8504893290885509, 'f1-score': 0.8954686530105526, 'support': 8481.0} | 0.9391 | {'precision': 0.9131611168303221, 'recall': 0.9174289551578925, 'f1-score': 0.913852819726857, 'support': 29841.0} | {'precision': 0.9394969959174669, 'recall': 0.9390771086759827, 'f1-score': 0.9382959675228925, 'support': 29841.0} |
| No log | 5.0 | 205 | 0.1936 | {'precision': 0.8609467455621301, 'recall': 0.8988416988416988, 'f1-score': 0.8794862108046846, 'support': 1295.0} | {'precision': 0.9656717938270347, 'recall': 0.9449289808123599, 'f1-score': 0.955187788105494, 'support': 20065.0} | {'precision': 0.8764539808018069, 'recall': 0.9151043509020163, 'f1-score': 0.8953622519612366, 'support': 8481.0} | 0.9345 | {'precision': 0.9010241733969906, 'recall': 0.9196250101853582, 'f1-score': 0.9100120836238051, 'support': 29841.0} | {'precision': 0.9357708116290517, 'recall': 0.9344525987734995, 'f1-score': 0.9348997979361299, 'support': 29841.0} |
| No log | 6.0 | 246 | 0.1947 | {'precision': 0.8310533515731874, 'recall': 0.9382239382239382, 'f1-score': 0.8813928182807399, 'support': 1295.0} | {'precision': 0.958739197762126, 'recall': 0.9565412409668577, 'f1-score': 0.9576389581878055, 'support': 20065.0} | {'precision': 0.9059808612440191, 'recall': 0.8930550642612899, 'f1-score': 0.8994715278190131, 'support': 8481.0} | 0.9377 | {'precision': 0.8985911368597775, 'recall': 0.9292734144840287, 'f1-score': 0.9128344347625195, 'support': 29841.0} | {'precision': 0.9382038060921168, 'recall': 0.9377031600817667, 'f1-score': 0.9377985799116962, 'support': 29841.0} |
| No log | 7.0 | 287 | 0.2014 | {'precision': 0.8799403430275914, 'recall': 0.9111969111969112, 'f1-score': 0.8952959028831563, 'support': 1295.0} | {'precision': 0.9675979919882359, 'recall': 0.9510092200348866, 'f1-score': 0.9592318906147891, 'support': 20065.0} | {'precision': 0.8888256065611118, 'recall': 0.9200565970993987, 'f1-score': 0.9041714947856316, 'support': 8481.0} | 0.9405 | {'precision': 0.9121213138589797, 'recall': 0.9274209094437321, 'f1-score': 0.919566429427859, 'support': 29841.0} | {'precision': 0.9414063343289258, 'recall': 0.940484568211521, 'f1-score': 0.9408087707079646, 'support': 29841.0} |
| No log | 8.0 | 328 | 0.2169 | {'precision': 0.8607322325915291, 'recall': 0.9258687258687258, 'f1-score': 0.8921130952380952, 'support': 1295.0} | {'precision': 0.9490554125588849, 'recall': 0.9739347121853975, 'f1-score': 0.9613341204250295, 'support': 20065.0} | {'precision': 0.9371261295659921, 'recall': 0.8681759226506308, 'f1-score': 0.9013343126453666, 'support': 8481.0} | 0.9418 | {'precision': 0.9156379249054686, 'recall': 0.9226597869015847, 'f1-score': 0.9182605094361639, 'support': 29841.0} | {'precision': 0.9418321034499257, 'recall': 0.9417914949230924, 'f1-score': 0.9412778355352336, 'support': 29841.0} |
| No log | 9.0 | 369 | 0.2356 | {'precision': 0.8841554559043349, 'recall': 0.9135135135135135, 'f1-score': 0.8985947588302315, 'support': 1295.0} | {'precision': 0.958962427602594, 'recall': 0.9654622476949912, 'f1-score': 0.9622013609496847, 'support': 20065.0} | {'precision': 0.9177306673090821, 'recall': 0.8983610423299139, 'f1-score': 0.9079425609247452, 'support': 8481.0} | 0.9441 | {'precision': 0.9202828502720036, 'recall': 0.9257789345128061, 'f1-score': 0.9229128935682205, 'support': 29841.0} | {'precision': 0.9439977284504704, 'recall': 0.9441372608156563, 'f1-score': 0.944020353853535, 'support': 29841.0} |
| No log | 10.0 | 410 | 0.2491 | {'precision': 0.846045197740113, 'recall': 0.9250965250965251, 'f1-score': 0.883806713389893, 'support': 1295.0} | {'precision': 0.9549009000147544, 'recall': 0.9676551208572141, 'f1-score': 0.9612357047378584, 'support': 20065.0} | {'precision': 0.9259762728620861, 'recall': 0.8835043037377668, 'f1-score': 0.904241839135944, 'support': 8481.0} | 0.9419 | {'precision': 0.9089741235389845, 'recall': 0.9254186498971686, 'f1-score': 0.9164280857545651, 'support': 29841.0} | {'precision': 0.941956364063297, 'recall': 0.9418920277470594, 'f1-score': 0.9416775291416837, 'support': 29841.0} |
| No log | 11.0 | 451 | 0.2823 | {'precision': 0.8699127906976745, 'recall': 0.9243243243243243, 'f1-score': 0.8962935230250841, 'support': 1295.0} | {'precision': 0.9454922579711543, 'recall': 0.9768751557438325, 'f1-score': 0.9609275419158742, 'support': 20065.0} | {'precision': 0.9427204551331781, 'recall': 0.8596863577408325, 'f1-score': 0.8992907801418439, 'support': 8481.0} | 0.9413 | {'precision': 0.9193751679340023, 'recall': 0.9202952792696631, 'f1-score': 0.9188372816942675, 'support': 29841.0} | {'precision': 0.9414245970352596, 'recall': 0.9412888308032573, 'f1-score': 0.9406050851929385, 'support': 29841.0} |
| No log | 12.0 | 492 | 0.2666 | {'precision': 0.8749080206033848, 'recall': 0.9181467181467181, 'f1-score': 0.896006028636021, 'support': 1295.0} | {'precision': 0.9618267212950934, 'recall': 0.9593820084724645, 'f1-score': 0.9606028094513335, 'support': 20065.0} | {'precision': 0.9059990552668871, 'recall': 0.9046103053885155, 'f1-score': 0.9053041477373296, 'support': 8481.0} | 0.9420 | {'precision': 0.9142445990551217, 'recall': 0.9273796773358992, 'f1-score': 0.9206376619415613, 'support': 29841.0} | {'precision': 0.9421881651816595, 'recall': 0.9420260715123487, 'f1-score': 0.9420832966618057, 'support': 29841.0} |
| 0.1288 | 13.0 | 533 | 0.2789 | {'precision': 0.8708971553610503, 'recall': 0.922007722007722, 'f1-score': 0.8957239309827456, 'support': 1295.0} | {'precision': 0.960913024019096, 'recall': 0.9630201844006977, 'f1-score': 0.961965450291233, 'support': 20065.0} | {'precision': 0.9144839134074871, 'recall': 0.9015446291710884, 'f1-score': 0.9079681748010924, 'support': 8481.0} | 0.9438 | {'precision': 0.9154313642625445, 'recall': 0.928857511859836, 'f1-score': 0.9218858520250238, 'support': 29841.0} | {'precision': 0.9438111897303917, 'recall': 0.9437686404611105, 'f1-score': 0.9437444234846122, 'support': 29841.0} |
| 0.1288 | 14.0 | 574 | 0.2878 | {'precision': 0.8693759071117562, 'recall': 0.9250965250965251, 'f1-score': 0.8963711185933408, 'support': 1295.0} | {'precision': 0.9577367433593365, 'recall': 0.9667580363817593, 'f1-score': 0.9622262456906173, 'support': 20065.0} | {'precision': 0.9222804239249605, 'recall': 0.8927013323900483, 'f1-score': 0.9072498502097065, 'support': 8481.0} | 0.9439 | {'precision': 0.9164643581320178, 'recall': 0.9281852979561108, 'f1-score': 0.9219490714978882, 'support': 29841.0} | {'precision': 0.9438252682725914, 'recall': 0.9439026842263999, 'f1-score': 0.943743714955569, 'support': 29841.0} |
| 0.1288 | 15.0 | 615 | 0.3028 | {'precision': 0.8794642857142857, 'recall': 0.9127413127413128, 'f1-score': 0.8957938613111027, 'support': 1295.0} | {'precision': 0.9580749193748449, 'recall': 0.9623722900573137, 'f1-score': 0.9602187966185977, 'support': 20065.0} | {'precision': 0.9105730040757612, 'recall': 0.8956490979837284, 'f1-score': 0.9030493966593355, 'support': 8481.0} | 0.9413 | {'precision': 0.916037403054964, 'recall': 0.9235875669274516, 'f1-score': 0.9196873515296785, 'support': 29841.0} | {'precision': 0.9411631364506148, 'recall': 0.9412553198619349, 'f1-score': 0.941175065769172, 'support': 29841.0} |
| 0.1288 | 16.0 | 656 | 0.3010 | {'precision': 0.8761061946902655, 'recall': 0.9173745173745174, 'f1-score': 0.8962655601659751, 'support': 1295.0} | {'precision': 0.9587562509283557, 'recall': 0.9650635434836781, 'f1-score': 0.9618995578957825, 'support': 20065.0} | {'precision': 0.9175916988416989, 'recall': 0.8967102935974531, 'f1-score': 0.907030830699505, 'support': 8481.0} | 0.9436 | {'precision': 0.9174847148201067, 'recall': 0.9263827848185495, 'f1-score': 0.9217319829204209, 'support': 29841.0} | {'precision': 0.943470289027774, 'recall': 0.9435675748131765, 'f1-score': 0.9434572234427906, 'support': 29841.0} |
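
The table shows validation loss and accuracy moving in different directions over training, which matters when choosing a checkpoint. The sketch below copies the Validation Loss and Accuracy columns from the table and picks the best epoch under each criterion; it is an illustration of reading the table, not a statement about which checkpoint was published.

```python
# (epoch, validation loss, accuracy) copied from the training results table.
history = [
    (1, 0.3153, 0.8906), (2, 0.2253, 0.9206), (3, 0.1809, 0.9360),
    (4, 0.1962, 0.9391), (5, 0.1936, 0.9345), (6, 0.1947, 0.9377),
    (7, 0.2014, 0.9405), (8, 0.2169, 0.9418), (9, 0.2356, 0.9441),
    (10, 0.2491, 0.9419), (11, 0.2823, 0.9413), (12, 0.2666, 0.9420),
    (13, 0.2789, 0.9438), (14, 0.2878, 0.9439), (15, 0.3028, 0.9413),
    (16, 0.3010, 0.9436),
]

# Epoch with the lowest validation loss.
best_by_loss = min(history, key=lambda row: row[1])
# Epoch with the highest token-level accuracy.
best_by_accuracy = max(history, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest accuracy:      ", best_by_accuracy)
```

Validation loss bottoms out at epoch 3 and rises steadily afterwards, while accuracy peaks at epoch 9 and then stays within about 0.003 of that peak. This pattern is consistent with the model becoming more confident on the examples it gets wrong as training continues, so a loss-based and an accuracy-based selection rule would pick different checkpoints.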
### Framework versions
- Transformers 4.37.2
- PyTorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2