przybytniowska committed • 1d5c979
Parent(s): e4fb4ba
Training complete

Files changed:
- README.md +54 -0
- test_metrics.json +3 -0
- train_losses.csv +127 -0
README.md
ADDED
@@ -0,0 +1,54 @@
---
license: mit
base_model: FacebookAI/roberta-base
tags:
- generated_from_trainer
datasets:
- arrow
model-index:
- name: roberta_base_QA_SQUAD_adamw_torch
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# roberta_base_QA_SQUAD_adamw_torch

This model is a fine-tuned version of [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) on the arrow dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
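The `lr_scheduler_type: linear` entry means the learning rate ramps down linearly from 2e-05 to zero over the course of training. A minimal sketch of that schedule in plain Python (assuming zero warmup steps, the Trainer's default; `linear_lr` is a hypothetical helper, not part of the training code):

```python
def linear_lr(step, total_steps, base_lr=2e-05, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps, then decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Starts at the base rate and reaches zero at the final step.
print(linear_lr(0, 100))    # 2e-05
print(linear_lr(100, 100))  # 0.0
```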

### Training results



### Framework versions

- Transformers 4.34.1
- Pytorch 2.3.0+cu118
- Datasets 2.19.0
- Tokenizers 0.14.1
test_metrics.json
ADDED
@@ -0,0 +1,3 @@
{
    "test_accuracy": 0.952
}
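The metrics file is plain JSON, so downstream tooling can read it with the standard library alone; a small sketch (the literal below mirrors the committed content, so no file access is needed):

```python
import json

# Contents of test_metrics.json as committed.
raw = '{\n    "test_accuracy": 0.952\n}'
metrics = json.loads(raw)
print(f"test accuracy: {metrics['test_accuracy']:.1%}")  # test accuracy: 95.2%
```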
train_losses.csv
ADDED
@@ -0,0 +1,127 @@
loss,epoch
0.2402,0.04
0.1457,0.08
0.1223,0.12
0.103,0.16
0.0809,0.2
0.0672,0.24
0.0642,0.28
0.049,0.32
0.0394,0.36
0.0378,0.4
0.0315,0.44
0.0283,0.48
0.0203,0.52
0.0187,0.56
0.023,0.6
0.0219,0.64
0.0184,0.68
0.0206,0.72
0.016,0.76
0.0102,0.8
0.014,0.84
0.0113,0.88
0.0119,0.92
0.0118,0.96
0.0125,1.0
0.0143,1.04
0.0128,1.08
0.0132,1.12
0.011,1.16
0.0094,1.2
0.0086,1.24
0.0104,1.28
0.0082,1.32
0.0063,1.36
0.0079,1.4
0.006,1.44
0.0065,1.48
0.011,1.52
0.0073,1.56
0.0053,1.6
0.0058,1.64
0.006,1.68
0.0053,1.72
0.0108,1.76
0.0092,1.8
0.0044,1.84
0.0045,1.88
0.007,1.92
0.0054,1.96
0.0037,2.0
0.0061,2.04
0.0036,2.08
0.0036,2.12
0.004,2.16
0.006,2.2
0.0044,2.24
0.0046,2.28
0.0014,2.32
0.0075,2.36
0.0036,2.4
0.0033,2.44
0.003,2.48
0.0034,2.52
0.0024,2.56
0.0023,2.6
0.0014,2.64
0.0066,2.68
0.0034,2.72
0.0031,2.76
0.0012,2.8
0.0029,2.84
0.0016,2.88
0.0027,2.92
0.0012,2.96
0.002,3.0
0.0024,3.04
0.0012,3.08
0.0006,3.12
0.0032,3.16
0.0022,3.2
0.0008,3.24
0.0021,3.28
0.0004,3.32
0.0024,3.36
0.0001,3.4
0.0041,3.44
0.0007,3.48
0.0003,3.52
0.0008,3.56
0.0002,3.6
0.0031,3.64
0.0006,3.68
0.0013,3.72
0.0005,3.76
0.0,3.8
0.002,3.84
0.0011,3.88
0.0,3.92
0.0,3.96
0.0008,4.0
0.0,4.04
0.0,4.08
0.0,4.12
0.0,4.16
0.0005,4.2
0.0,4.24
0.0008,4.28
0.002,4.32
0.0004,4.36
0.0,4.4
0.0,4.44
0.0003,4.48
0.0,4.52
0.0,4.56
0.0004,4.6
0.0,4.64
0.0,4.68
0.0004,4.72
0.0,4.76
0.0,4.8
0.0,4.84
0.0006,4.88
0.0006,4.92
0.0,4.96
0.0,5.0
0.012337412198364735,5.0
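The CSV logs one row per logging step (training `loss` at a fractional `epoch`); the final high-precision row at epoch 5.0 appears to be the overall average training loss. A minimal parsing sketch, using only the first few committed rows inline (swap `io.StringIO(sample)` for `open("train_losses.csv")` to read the real file):

```python
import csv
import io

# First rows of train_losses.csv, copied from the commit above.
sample = """loss,epoch
0.2402,0.04
0.1457,0.08
0.1223,0.12
0.103,0.16
"""

# Parse into (loss, epoch) float pairs.
rows = [(float(r["loss"]), float(r["epoch"]))
        for r in csv.DictReader(io.StringIO(sample))]
print(rows[0])  # (0.2402, 0.04)
```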