Theoreticallyhugo committed 9d76f33 (1 parent: 6dfff15): trainer: training complete at 2023-11-27 14:10:49.177196.

---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: longformer-one-step
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longformer-one-step

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4163
- Accuracy: 0.8526

Per-class results (rounded to four decimal places):

| Label        | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| B-claim      | 0.5556 | 0.5519 | 0.5537 | 154   |
| B-majorclaim | 0.6286 | 0.6875 | 0.6567 | 64    |
| B-premise    | 0.7459 | 0.8415 | 0.7908 | 429   |
| I-claim      | 0.6675 | 0.6068 | 0.6357 | 2243  |
| I-majorclaim | 0.7214 | 0.7511 | 0.7360 | 872   |
| I-premise    | 0.8962 | 0.9113 | 0.9037 | 7511  |
| O            | 0.9140 | 0.9101 | 0.9120 | 4517  |
| Macro avg    | 0.7327 | 0.7515 | 0.7412 | 15790 |
| Weighted avg | 0.8506 | 0.8526 | 0.8513 | 15790 |
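
The macro and weighted averages follow the usual convention (as in scikit-learn's `classification_report`): the macro average treats every class equally, while the weighted average weights each class by its support. A minimal sketch that reproduces the reported F1 aggregates from the per-class numbers above:

```python
# Per-class F1-scores and supports from the evaluation results above.
f1 = {
    "B-claim": 0.5537459283387622, "B-majorclaim": 0.6567164179104478,
    "B-premise": 0.7907995618838992, "I-claim": 0.6356842596917328,
    "I-majorclaim": 0.7359550561797752, "I-premise": 0.903690012542082,
    "O": 0.9120354963948973,
}
support = {
    "B-claim": 154, "B-majorclaim": 64, "B-premise": 429, "I-claim": 2243,
    "I-majorclaim": 872, "I-premise": 7511, "O": 4517,
}

# Macro average: unweighted mean over the seven classes.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted average: mean weighted by each class's support (token count).
total = sum(support.values())
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total

print(round(macro_f1, 4))     # 0.7412
print(round(weighted_f1, 4))  # 0.8513
```

The gap between the two (0.7412 vs 0.8513) reflects class imbalance: I-premise and O alone account for roughly three quarters of the 15790 evaluation tokens, and they are also the best-scoring classes.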

## Model description

More information needed

## Intended uses & limitations

More information needed
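
The model emits BIO tags over tokens: B-/I- prefixes mark the beginning and inside of claim, majorclaim, and premise spans, and O marks non-argumentative text. As an illustration of how such per-token output is typically consumed, here is a minimal span-decoding sketch; it is not part of this repository, and the example tag sequence is made up:

```python
def decode_bio(tags):
    """Group a BIO tag sequence into (label, start, end) spans; end is exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        # Close the open span on O, on a new B- tag, or on a mismatched I- tag.
        if tag.startswith("B-") or tag == "O" or (label and tag != f"I-{label}"):
            if label is not None:
                spans.append((label, start, i))
                start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and label is None:
            # Tolerate an I- tag with no preceding B- by opening a new span.
            start, label = i, tag[2:]
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans

tags = ["O", "B-claim", "I-claim", "I-claim", "O", "B-premise", "I-premise"]
print(decode_bio(tags))  # [('claim', 1, 4), ('premise', 5, 7)]
```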

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
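
With `lr_scheduler_type: linear` and no warmup steps listed, the learning rate presumably decays linearly from 2e-05 to zero over training (588 optimizer steps in total, per the training results). A rough sketch of that schedule, mirroring the shape of `transformers`' `get_linear_schedule_with_warmup`, with the zero-warmup default being an assumption:

```python
def linear_lr(step, total_steps=588, base_lr=2e-05, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps, then decay linearly to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# 196 optimizer steps per epoch (see the results table) times 3 epochs = 588.
print(linear_lr(0))    # 2e-05 at the start of training
print(linear_lr(294))  # 1e-05 halfway through
print(linear_lr(588))  # 0.0 at the end
```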

### Training results

Per-class cells give precision / recall / F1-score, rounded to four decimal places; supports are as in the evaluation results above.

| Training Loss | Epoch | Step | Validation Loss | B-claim | B-majorclaim | B-premise | I-claim | I-majorclaim | I-premise | O | Accuracy | Macro avg | Weighted avg |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No log | 1.0 | 196 | 0.4925 | 0.4490 / 0.1429 / 0.2167 | 0.0000 / 0.0000 / 0.0000 | 0.6522 / 0.8741 / 0.7470 | 0.6344 / 0.3767 / 0.4727 | 0.6210 / 0.6445 / 0.6325 | 0.8455 / 0.9221 / 0.8821 | 0.8657 / 0.9079 / 0.8863 | 0.8126 | 0.5811 / 0.5526 / 0.5482 | 0.7963 / 0.8126 / 0.7976 |
| No log | 2.0 | 392 | 0.4278 | 0.5455 / 0.5065 / 0.5253 | 0.7091 / 0.6094 / 0.6555 | 0.6920 / 0.8695 / 0.7707 | 0.6465 / 0.5439 / 0.5908 | 0.7233 / 0.7133 / 0.7182 | 0.8696 / 0.9232 / 0.8956 | 0.9275 / 0.8895 / 0.9081 | 0.8413 | 0.7305 / 0.7222 / 0.7234 | 0.8378 / 0.8413 / 0.8381 |
| 0.598 | 3.0 | 588 | 0.4163 | 0.5556 / 0.5519 / 0.5537 | 0.6286 / 0.6875 / 0.6567 | 0.7459 / 0.8415 / 0.7908 | 0.6675 / 0.6068 / 0.6357 | 0.7214 / 0.7511 / 0.7360 | 0.8962 / 0.9113 / 0.9037 | 0.9140 / 0.9101 / 0.9120 | 0.8526 | 0.7327 / 0.7515 / 0.7412 | 0.8506 / 0.8526 / 0.8513 |
65
+
66
+ ### Framework versions
67
+
68
+ - Transformers 4.33.0
69
+ - Pytorch 2.0.1+cu118
70
+ - Datasets 2.14.4
71
+ - Tokenizers 0.13.3