galkowskim committed
Commit 4b6111e
Parent(s): e6ba9cb

Training complete

Files changed:
- README.md +53 -0
- test_metrics.json +4 -0
- train_losses.csv +111 -0
README.md
ADDED
@@ -0,0 +1,53 @@
---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
model-index:
- name: longformer_base_4096_QA_SQUAD_adafactor
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longformer_base_4096_QA_SQUAD_adafactor

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on an unknown dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP

### Training results

### Framework versions

- Transformers 4.40.0
- Pytorch 2.2.1
- Datasets 2.19.0
- Tokenizers 0.19.1
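The card above has no usage snippet. A minimal sketch, assuming the checkpoint is published under the committer's namespace as `galkowskim/longformer_base_4096_QA_SQUAD_adafactor` (that repo id is an assumption; adjust it to wherever this model actually lives):

```python
def build_qa_pipeline(model_id: str = "galkowskim/longformer_base_4096_QA_SQUAD_adafactor"):
    """Create an extractive question-answering pipeline for this checkpoint.

    NOTE: the default model_id is an assumption based on the committer's
    username; transformers is imported lazily so this helper stays
    importable even where the library is not installed.
    """
    from transformers import pipeline
    return pipeline("question-answering", model=model_id)

# Example (downloads the checkpoint on first use):
#   qa = build_qa_pipeline()
#   qa(question="Who wrote Hamlet?",
#      context="Hamlet is a tragedy written by William Shakespeare.")
#   # returns a dict with "answer", "score", "start", "end"
```

The lazy import also keeps the helper cheap to define in notebooks where you only sometimes run inference.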
test_metrics.json
ADDED
@@ -0,0 +1,4 @@
{
    "exact_match": 84.76821192052981,
    "f1": 91.83286266332226
}
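These are the standard SQuAD-style metrics: exact match scores a prediction only when it reproduces a gold answer verbatim, while token-level F1 gives partial credit for overlap, so EM can never exceed F1. A quick sanity check on the reported numbers (the JSON literal below is copied from test_metrics.json above):

```python
import json

# Literal copy of test_metrics.json from this commit.
raw = '{"exact_match": 84.76821192052981, "f1": 91.83286266332226}'
metrics = json.loads(raw)

# EM is the stricter criterion, so EM <= F1 must hold;
# both are percentages on the SQuAD scale.
assert 0.0 <= metrics["exact_match"] <= metrics["f1"] <= 100.0

print(f"EM = {metrics['exact_match']:.2f}  F1 = {metrics['f1']:.2f}")
# prints "EM = 84.77  F1 = 91.83"
```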
train_losses.csv
ADDED
@@ -0,0 +1,111 @@
loss,epoch
1.9365,0.045662100456621
1.2435,0.091324200913242
1.1523,0.136986301369863
1.0821,0.182648401826484
1.0575,0.228310502283105
0.9934,0.273972602739726
0.9816,0.319634703196347
1.0224,0.365296803652968
0.9772,0.410958904109589
0.9784,0.45662100456621
0.9552,0.502283105022831
0.9833,0.547945205479452
0.9604,0.593607305936073
0.9575,0.639269406392694
0.8762,0.684931506849315
0.9512,0.730593607305936
0.9127,0.776255707762557
0.9121,0.821917808219178
0.8988,0.867579908675799
0.88,0.91324200913242
0.8618,0.958904109589041
0.8521,1.004566210045662
0.696,1.0502283105022832
0.6865,1.095890410958904
0.6996,1.1415525114155252
0.6876,1.187214611872146
0.7277,1.2328767123287672
0.6717,1.278538812785388
0.6934,1.3242009132420092
0.6777,1.36986301369863
0.6646,1.4155251141552512
0.7085,1.461187214611872
0.6761,1.5068493150684932
0.6948,1.5525114155251143
0.7275,1.5981735159817352
0.6744,1.643835616438356
0.6833,1.6894977168949772
0.6678,1.7351598173515983
0.6787,1.7808219178082192
0.6791,1.82648401826484
0.7341,1.8721461187214612
0.6752,1.9178082191780823
0.7042,1.9634703196347032
0.6583,2.009132420091324
0.502,2.0547945205479454
0.5152,2.1004566210045663
0.5054,2.146118721461187
0.4923,2.191780821917808
0.5066,2.237442922374429
0.5125,2.2831050228310503
0.5285,2.328767123287671
0.522,2.374429223744292
0.531,2.4200913242009134
0.5315,2.4657534246575343
0.5128,2.5114155251141552
0.5238,2.557077625570776
0.5042,2.602739726027397
0.513,2.6484018264840183
0.548,2.6940639269406392
0.5063,2.73972602739726
0.5324,2.7853881278538815
0.517,2.8310502283105023
0.5246,2.8767123287671232
0.4829,2.922374429223744
0.514,2.968036529680365
0.4639,3.0136986301369864
0.3824,3.0593607305936072
0.3976,3.105022831050228
0.3907,3.1506849315068495
0.3895,3.1963470319634704
0.3711,3.2420091324200913
0.3741,3.287671232876712
0.3739,3.3333333333333335
0.3796,3.3789954337899544
0.4116,3.4246575342465753
0.3786,3.470319634703196
0.3814,3.5159817351598175
0.3941,3.5616438356164384
0.4143,3.6073059360730593
0.3815,3.65296803652968
0.3875,3.6986301369863015
0.3592,3.7442922374429224
0.3686,3.7899543378995433
0.3978,3.8356164383561646
0.4005,3.8812785388127855
0.3878,3.9269406392694064
0.366,3.9726027397260273
0.3548,4.018264840182648
0.2972,4.063926940639269
0.2741,4.109589041095891
0.2896,4.155251141552512
0.2915,4.200913242009133
0.303,4.2465753424657535
0.2812,4.292237442922374
0.3036,4.337899543378995
0.2744,4.383561643835616
0.2912,4.429223744292237
0.2685,4.474885844748858
0.2725,4.52054794520548
0.3112,4.566210045662101
0.2664,4.6118721461187215
0.2796,4.657534246575342
0.3061,4.703196347031963
0.3056,4.748858447488584
0.2894,4.794520547945205
0.2958,4.840182648401827
0.3091,4.885844748858448
0.2921,4.931506849315069
0.3139,4.9771689497716896
0.580738905815229,5.0
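The epoch column advances by 0.045662100456621 = 10/219 per row, which suggests the Trainer logged every 10 optimizer steps over 219 steps per epoch. To summarize a log like this, one natural reduction is the mean loss per integer epoch. A minimal stdlib-only sketch (the five sample rows are copied from the file above; run it on the full CSV for the real per-epoch means):

```python
import csv
import io
import math
from collections import defaultdict

# Five rows copied from train_losses.csv; substitute the full file contents.
sample = """loss,epoch
1.9365,0.045662100456621
0.8618,0.958904109589041
0.8521,1.004566210045662
0.6934,1.3242009132420092
0.502,2.0547945205479454
"""

def mean_loss_per_epoch(csv_text: str) -> dict[int, float]:
    """Average logged losses by the integer epoch they fall in.

    A fractional epoch e was logged *during* epoch ceil(e) (e.g. 0.0457
    and 0.9589 both belong to epoch 1; exactly 5.0 belongs to epoch 5),
    so each row is bucketed by math.ceil of its epoch value.
    """
    buckets: dict[int, list[float]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        buckets[math.ceil(float(row["epoch"]))].append(float(row["loss"]))
    return {epoch: sum(losses) / len(losses) for epoch, losses in sorted(buckets.items())}

print(mean_loss_per_epoch(sample))
```

On the full file this makes the epoch-over-epoch decrease (roughly 0.97 → 0.68 → 0.52 → 0.39 → 0.29 at the logged steps) easy to read off, and the same bucketing works for any Trainer log with fractional epochs.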