Commit d71355f · verified · committed by tyzhu
1 Parent(s): e57b6cb

End of training

Files changed (5)
  1. README.md +14 -2
  2. all_results.json +21 -0
  3. eval_results.json +16 -0
  4. train_results.json +8 -0
  5. trainer_state.json +1002 -0
README.md CHANGED
@@ -3,11 +3,23 @@ license: mit
  base_model: gpt2-xl
  tags:
  - generated_from_trainer
+ datasets:
+ - tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa
  metrics:
  - accuracy
  model-index:
  - name: lmind_hotpot_train8000_eval7405_v1_recite_qa_gpt2-xl
-   results: []
+   results:
+   - task:
+       name: Causal Language Modeling
+       type: text-generation
+     dataset:
+       name: tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa
+       type: tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.7664424114149346
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,7 +27,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # lmind_hotpot_train8000_eval7405_v1_recite_qa_gpt2-xl
 
- This model is a fine-tuned version of [gpt2-xl](https://huggingface.co/gpt2-xl) on an unknown dataset.
+ This model is a fine-tuned version of [gpt2-xl](https://huggingface.co/gpt2-xl) on the tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.4650
  - Accuracy: 0.7664
all_results.json ADDED
@@ -0,0 +1,21 @@
+ {
+     "epoch": 20.0,
+     "eval_accuracy": 0.7664424114149346,
+     "eval_exact_match": 20.432140445644833,
+     "eval_f1": 27.38396738548711,
+     "eval_loss": 0.4650193452835083,
+     "eval_qa_bleu": 16.030026926999476,
+     "eval_qa_exact_match": 0.19324780553679946,
+     "eval_recite_bleu": 50.43675034450751,
+     "eval_recite_exact_match": 0.053207292370020254,
+     "eval_runtime": 221.8832,
+     "eval_samples": 7405,
+     "eval_samples_per_second": 33.373,
+     "eval_steps_per_second": 2.087,
+     "perplexity": 1.592044987150834,
+     "train_loss": 0.0060400176945494205,
+     "train_runtime": 6746.5616,
+     "train_samples": 34854,
+     "train_samples_per_second": 103.324,
+     "train_steps_per_second": 6.46
+ }
eval_results.json ADDED
@@ -0,0 +1,16 @@
+ {
+     "epoch": 20.0,
+     "eval_accuracy": 0.7664424114149346,
+     "eval_exact_match": 20.432140445644833,
+     "eval_f1": 27.38396738548711,
+     "eval_loss": 0.4650193452835083,
+     "eval_qa_bleu": 16.030026926999476,
+     "eval_qa_exact_match": 0.19324780553679946,
+     "eval_recite_bleu": 50.43675034450751,
+     "eval_recite_exact_match": 0.053207292370020254,
+     "eval_runtime": 221.8832,
+     "eval_samples": 7405,
+     "eval_samples_per_second": 33.373,
+     "eval_steps_per_second": 2.087,
+     "perplexity": 1.592044987150834
+ }
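The `perplexity` field in this file appears to be the exponential of `eval_loss`, which is the usual definition when the loss is a mean per-token cross-entropy in nats. The sketch below only checks that relationship against the two reported values; the tolerance and the variable names are illustrative assumptions, not part of the commit.

```python
import math

# Values copied from eval_results.json above.
eval_loss = 0.4650193452835083
reported_perplexity = 1.592044987150834

# Assumption: eval_loss is a mean per-token cross-entropy in nats,
# so perplexity = exp(eval_loss).
computed_perplexity = math.exp(eval_loss)

print(f"computed perplexity: {computed_perplexity:.6f}")
print(f"reported perplexity: {reported_perplexity:.6f}")
```

If the two numbers disagree in a future run, the loss is probably averaged differently (for example per sequence rather than per token).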
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "epoch": 20.0,
+     "train_loss": 0.0060400176945494205,
+     "train_runtime": 6746.5616,
+     "train_samples": 34854,
+     "train_samples_per_second": 103.324,
+     "train_steps_per_second": 6.46
+ }
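The throughput figures can be cross-checked from the other fields. The sketch below assumes that `train_runtime` covers all 20 epochs and that `train_samples_per_second` counts every sample once per epoch; the effective batch size is not stated anywhere in this commit, so the value derived here is an inference, not a recorded hyperparameter.

```python
import math

# Values copied from train_results.json and trainer_state.json in this commit.
train_samples = 34854
num_epochs = 20
train_runtime = 6746.5616   # seconds
global_step = 43580

# Assumption: throughput = total samples seen over all epochs / wall-clock runtime.
samples_per_second = train_samples * num_epochs / train_runtime
steps_per_second = global_step / train_runtime

# Optimizer steps per epoch, matching the "eval_steps": 2179 field.
steps_per_epoch = global_step // num_epochs

# Inferred (not stated in the commit): smallest batch size that fits
# train_samples into steps_per_epoch steps.
inferred_batch_size = math.ceil(train_samples / steps_per_epoch)

print(f"samples/s: {samples_per_second:.3f}")
print(f"steps/s:   {steps_per_second:.2f}")
print(f"steps per epoch: {steps_per_epoch}, inferred effective batch size: {inferred_batch_size}")
```

The first two printed values agree with the reported `train_samples_per_second` and `train_steps_per_second` after rounding.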
trainer_state.json ADDED
@@ -0,0 +1,1002 @@
+ {
+   "best_metric": null,
+   "best_model_checkpoint": null,
+   "epoch": 20.0,
+   "eval_steps": 2179,
+   "global_step": 43580,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 0.2,
+       "learning_rate": 3e-05,
+       "loss": 2.2464,
+       "step": 436
+     },
+     {
+       "epoch": 0.4,
+       "learning_rate": 3e-05,
+       "loss": 2.0544,
+       "step": 872
+     },
+     {
+       "epoch": 0.6,
+       "learning_rate": 3e-05,
+       "loss": 1.9968,
+       "step": 1308
+     },
+     {
+       "epoch": 0.8,
+       "learning_rate": 3e-05,
+       "loss": 1.9537,
+       "step": 1744
+     },
+     {
+       "epoch": 1.0,
+       "eval_accuracy": 0.6693020521036767,
+       "eval_loss": 1.5623117685317993,
+       "eval_runtime": 220.2436,
+       "eval_samples_per_second": 33.622,
+       "eval_steps_per_second": 2.102,
+       "step": 2179
+     },
+     {
+       "epoch": 1.0,
+       "eval_exact_match": 10.24983119513842,
+       "eval_f1": 15.257097040945366,
+       "eval_qa_bleu": 5.8382901471491575,
+       "eval_qa_exact_match": 0.09736664415935178,
+       "eval_recite_bleu": 17.0930417063702,
+       "eval_recite_exact_match": 0.0,
+       "step": 2179
+     },
+     {
+       "epoch": 1.0,
+       "learning_rate": 3e-05,
+       "loss": 1.9029,
+       "step": 2180
+     },
+     {
+       "epoch": 1.2,
+       "learning_rate": 3e-05,
+       "loss": 1.5215,
+       "step": 2616
+     },
+     {
+       "epoch": 1.4,
+       "learning_rate": 3e-05,
+       "loss": 1.5077,
+       "step": 3052
+     },
+     {
+       "epoch": 1.6,
+       "learning_rate": 3e-05,
+       "loss": 1.5016,
+       "step": 3488
+     },
+     {
+       "epoch": 1.8,
+       "learning_rate": 3e-05,
+       "loss": 1.4734,
+       "step": 3924
+     },
+     {
+       "epoch": 2.0,
+       "eval_accuracy": 0.6923618412861704,
+       "eval_loss": 1.2099343538284302,
+       "eval_runtime": 222.1542,
+       "eval_samples_per_second": 33.333,
+       "eval_steps_per_second": 2.084,
+       "step": 4358
+     },
+     {
+       "epoch": 2.0,
+       "eval_exact_match": 10.61444969615125,
+       "eval_f1": 15.935672519396675,
+       "eval_qa_bleu": 6.524750946372503,
+       "eval_qa_exact_match": 0.10209318028359217,
+       "eval_recite_bleu": 18.618913529891735,
+       "eval_recite_exact_match": 0.0,
+       "step": 4358
+     },
+     {
+       "epoch": 2.0,
+       "learning_rate": 3e-05,
+       "loss": 1.4425,
+       "step": 4360
+     },
+     {
+       "epoch": 2.2,
+       "learning_rate": 3e-05,
+       "loss": 1.0827,
+       "step": 4796
+     },
+     {
+       "epoch": 2.4,
+       "learning_rate": 3e-05,
+       "loss": 1.0876,
+       "step": 5232
+     },
+     {
+       "epoch": 2.6,
+       "learning_rate": 3e-05,
+       "loss": 1.0887,
+       "step": 5668
+     },
+     {
+       "epoch": 2.8,
+       "learning_rate": 3e-05,
+       "loss": 1.0665,
+       "step": 6104
+     },
+     {
+       "epoch": 3.0,
+       "eval_accuracy": 0.7147465927772421,
+       "eval_loss": 0.9177509546279907,
+       "eval_runtime": 226.6441,
+       "eval_samples_per_second": 32.672,
+       "eval_steps_per_second": 2.043,
+       "step": 6537
+     },
+     {
+       "epoch": 3.0,
+       "eval_exact_match": 12.397029034436192,
+       "eval_f1": 17.721572805797255,
+       "eval_qa_bleu": 8.139751517788653,
+       "eval_qa_exact_match": 0.11978392977717758,
+       "eval_recite_bleu": 21.769859861022653,
+       "eval_recite_exact_match": 0.0,
+       "step": 6537
+     },
+     {
+       "epoch": 3.0,
+       "learning_rate": 3e-05,
+       "loss": 0.7772,
+       "step": 6540
+     },
+     {
+       "epoch": 3.2,
+       "learning_rate": 3e-05,
+       "loss": 0.7688,
+       "step": 6976
+     },
+     {
+       "epoch": 3.4,
+       "learning_rate": 3e-05,
+       "loss": 0.7598,
+       "step": 7412
+     },
+     {
+       "epoch": 3.6,
+       "learning_rate": 3e-05,
+       "loss": 0.77,
+       "step": 7848
+     },
+     {
+       "epoch": 3.8,
+       "learning_rate": 3e-05,
+       "loss": 0.7684,
+       "step": 8284
+     },
+     {
+       "epoch": 4.0,
+       "eval_accuracy": 0.7331474586575765,
+       "eval_loss": 0.6987847089767456,
+       "eval_runtime": 221.7249,
+       "eval_samples_per_second": 33.397,
+       "eval_steps_per_second": 2.088,
+       "step": 8716
+     },
+     {
+       "epoch": 4.0,
+       "eval_exact_match": 13.409858203916272,
+       "eval_f1": 19.326947132558665,
+       "eval_qa_bleu": 8.79837393539171,
+       "eval_qa_exact_match": 0.12869682646860228,
+       "eval_recite_bleu": 23.848732661358156,
+       "eval_recite_exact_match": 0.0,
+       "step": 8716
+     },
+     {
+       "epoch": 4.0,
+       "learning_rate": 3e-05,
+       "loss": 0.7559,
+       "step": 8720
+     },
+     {
+       "epoch": 4.2,
+       "learning_rate": 3e-05,
+       "loss": 0.5354,
+       "step": 9156
+     },
+     {
+       "epoch": 4.4,
+       "learning_rate": 3e-05,
+       "loss": 0.5511,
+       "step": 9592
+     },
+     {
+       "epoch": 4.6,
+       "learning_rate": 3e-05,
+       "loss": 0.5496,
+       "step": 10028
+     },
+     {
+       "epoch": 4.8,
+       "learning_rate": 3e-05,
+       "loss": 0.548,
+       "step": 10464
+     },
+     {
+       "epoch": 5.0,
+       "eval_accuracy": 0.7465929589970037,
+       "eval_loss": 0.5567022562026978,
+       "eval_runtime": 226.4909,
+       "eval_samples_per_second": 32.694,
+       "eval_steps_per_second": 2.044,
+       "step": 10895
+     },
+     {
+       "epoch": 5.0,
+       "eval_exact_match": 14.220121539500338,
+       "eval_f1": 20.736856637999697,
+       "eval_qa_bleu": 9.542509177303282,
+       "eval_qa_exact_match": 0.13652937204591492,
+       "eval_recite_bleu": 27.315073242003514,
+       "eval_recite_exact_match": 0.0009453072248480756,
+       "step": 10895
+     },
+     {
+       "epoch": 5.0,
+       "learning_rate": 3e-05,
+       "loss": 0.5518,
+       "step": 10900
+     },
+     {
+       "epoch": 5.2,
+       "learning_rate": 3e-05,
+       "loss": 0.3837,
+       "step": 11336
+     },
+     {
+       "epoch": 5.4,
+       "learning_rate": 3e-05,
+       "loss": 0.3963,
+       "step": 11772
+     },
+     {
+       "epoch": 5.6,
+       "learning_rate": 3e-05,
+       "loss": 0.404,
+       "step": 12208
+     },
+     {
+       "epoch": 5.8,
+       "learning_rate": 3e-05,
+       "loss": 0.4039,
+       "step": 12644
+     },
+     {
+       "epoch": 6.0,
+       "eval_accuracy": 0.7551407100982113,
+       "eval_loss": 0.4728190004825592,
+       "eval_runtime": 223.8973,
+       "eval_samples_per_second": 33.073,
+       "eval_steps_per_second": 2.068,
+       "step": 13074
+     },
+     {
+       "epoch": 6.0,
+       "eval_exact_match": 16.12424037812289,
+       "eval_f1": 23.0965386152669,
+       "eval_qa_bleu": 10.706681466541065,
+       "eval_qa_exact_match": 0.1524645509790682,
+       "eval_recite_bleu": 31.825156559426375,
+       "eval_recite_exact_match": 0.0036461850101282916,
+       "step": 13074
+     },
+     {
+       "epoch": 6.0,
+       "learning_rate": 3e-05,
+       "loss": 0.405,
+       "step": 13080
+     },
+     {
+       "epoch": 6.2,
+       "learning_rate": 3e-05,
+       "loss": 0.285,
+       "step": 13516
+     },
+     {
+       "epoch": 6.4,
+       "learning_rate": 3e-05,
+       "loss": 0.2984,
+       "step": 13952
+     },
+     {
+       "epoch": 6.6,
+       "learning_rate": 3e-05,
+       "loss": 0.3051,
+       "step": 14388
+     },
+     {
+       "epoch": 6.8,
+       "learning_rate": 3e-05,
+       "loss": 0.3044,
+       "step": 14824
+     },
+     {
+       "epoch": 7.0,
+       "eval_accuracy": 0.7600033806411041,
+       "eval_loss": 0.4375591278076172,
+       "eval_runtime": 225.3993,
+       "eval_samples_per_second": 32.853,
+       "eval_steps_per_second": 2.054,
+       "step": 15253
+     },
+     {
+       "epoch": 7.0,
+       "eval_exact_match": 16.948008102633356,
+       "eval_f1": 23.931422337180113,
+       "eval_qa_bleu": 11.914541362683222,
+       "eval_qa_exact_match": 0.16029709655638083,
+       "eval_recite_bleu": 35.91072342079755,
+       "eval_recite_exact_match": 0.010938555030384874,
+       "step": 15253
+     },
+     {
+       "epoch": 7.0,
+       "learning_rate": 3e-05,
+       "loss": 0.2046,
+       "step": 15260
+     },
+     {
+       "epoch": 7.2,
+       "learning_rate": 3e-05,
+       "loss": 0.2265,
+       "step": 15696
+     },
+     {
+       "epoch": 7.4,
+       "learning_rate": 3e-05,
+       "loss": 0.236,
+       "step": 16132
+     },
+     {
+       "epoch": 7.6,
+       "learning_rate": 3e-05,
+       "loss": 0.2399,
+       "step": 16568
+     },
+     {
+       "epoch": 7.8,
+       "learning_rate": 3e-05,
+       "loss": 0.2446,
+       "step": 17004
+     },
+     {
+       "epoch": 8.0,
+       "eval_accuracy": 0.7628389388058522,
+       "eval_loss": 0.42204150557518005,
+       "eval_runtime": 223.5169,
+       "eval_samples_per_second": 33.129,
+       "eval_steps_per_second": 2.071,
+       "step": 17432
+     },
+     {
+       "epoch": 8.0,
+       "eval_exact_match": 18.433490884537473,
+       "eval_f1": 25.859990277646197,
+       "eval_qa_bleu": 13.293068057422849,
+       "eval_qa_exact_match": 0.17609723160027008,
+       "eval_recite_bleu": 41.25950959223269,
+       "eval_recite_exact_match": 0.015530047265361242,
+       "step": 17432
+     },
+     {
+       "epoch": 8.0,
+       "learning_rate": 3e-05,
+       "loss": 0.2466,
+       "step": 17440
+     },
+     {
+       "epoch": 8.2,
+       "learning_rate": 3e-05,
+       "loss": 0.1887,
+       "step": 17876
+     },
+     {
+       "epoch": 8.4,
+       "learning_rate": 3e-05,
+       "loss": 0.1967,
+       "step": 18312
+     },
+     {
+       "epoch": 8.6,
+       "learning_rate": 3e-05,
+       "loss": 0.2026,
+       "step": 18748
+     },
+     {
+       "epoch": 8.8,
+       "learning_rate": 3e-05,
+       "loss": 0.2039,
+       "step": 19184
+     },
+     {
+       "epoch": 9.0,
+       "eval_accuracy": 0.7641864696201232,
+       "eval_loss": 0.41897761821746826,
+       "eval_runtime": 222.9387,
+       "eval_samples_per_second": 33.215,
+       "eval_steps_per_second": 2.077,
+       "step": 19611
+     },
+     {
+       "epoch": 9.0,
+       "eval_exact_match": 18.933153274814316,
+       "eval_f1": 26.522905186169627,
+       "eval_qa_bleu": 12.546175820192783,
+       "eval_qa_exact_match": 0.1797434166103984,
+       "eval_recite_bleu": 44.27793915167843,
+       "eval_recite_exact_match": 0.02228224172856178,
+       "step": 19611
+     },
+     {
+       "epoch": 9.0,
+       "learning_rate": 3e-05,
+       "loss": 0.2067,
+       "step": 19620
+     },
+     {
+       "epoch": 9.2,
+       "learning_rate": 3e-05,
+       "loss": 0.1667,
+       "step": 20056
+     },
+     {
+       "epoch": 9.4,
+       "learning_rate": 3e-05,
+       "loss": 0.1709,
+       "step": 20492
+     },
+     {
+       "epoch": 9.6,
+       "learning_rate": 3e-05,
+       "loss": 0.1774,
+       "step": 20928
+     },
+     {
+       "epoch": 9.8,
+       "learning_rate": 3e-05,
+       "loss": 0.1787,
+       "step": 21364
+     },
+     {
+       "epoch": 10.0,
+       "eval_accuracy": 0.7648945867029934,
+       "eval_loss": 0.4250437319278717,
+       "eval_runtime": 225.6304,
+       "eval_samples_per_second": 32.819,
+       "eval_steps_per_second": 2.052,
+       "step": 21790
+     },
+     {
+       "epoch": 10.0,
+       "eval_exact_match": 20.054017555705606,
+       "eval_f1": 27.607870062960497,
+       "eval_qa_bleu": 14.597537360428676,
+       "eval_qa_exact_match": 0.1913571910871033,
+       "eval_recite_bleu": 46.3428836838782,
+       "eval_recite_exact_match": 0.029169480081026333,
+       "step": 21790
+     },
+     {
+       "epoch": 10.0,
+       "learning_rate": 3e-05,
+       "loss": 0.1829,
+       "step": 21800
+     },
+     {
+       "epoch": 10.2,
+       "learning_rate": 3e-05,
+       "loss": 0.1498,
+       "step": 22236
+     },
+     {
+       "epoch": 10.4,
+       "learning_rate": 3e-05,
+       "loss": 0.1553,
+       "step": 22672
+     },
+     {
+       "epoch": 10.6,
+       "learning_rate": 3e-05,
+       "loss": 0.1612,
+       "step": 23108
+     },
+     {
+       "epoch": 10.8,
+       "learning_rate": 3e-05,
+       "loss": 0.1652,
+       "step": 23544
+     },
+     {
+       "epoch": 11.0,
+       "eval_accuracy": 0.7653962302216591,
+       "eval_loss": 0.42946067452430725,
+       "eval_runtime": 227.2445,
+       "eval_samples_per_second": 32.586,
+       "eval_steps_per_second": 2.037,
+       "step": 23969
+     },
+     {
+       "epoch": 11.0,
+       "eval_exact_match": 19.581363943281566,
+       "eval_f1": 27.25909178622117,
+       "eval_qa_bleu": 14.227446461856255,
+       "eval_qa_exact_match": 0.18595543551654287,
+       "eval_recite_bleu": 46.797968452871594,
+       "eval_recite_exact_match": 0.03430114787305875,
+       "step": 23969
+     },
+     {
+       "epoch": 11.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1439,
+       "step": 23980
+     },
+     {
+       "epoch": 11.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1402,
+       "step": 24416
+     },
+     {
+       "epoch": 11.41,
+       "learning_rate": 3e-05,
+       "loss": 0.1462,
+       "step": 24852
+     },
+     {
+       "epoch": 11.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1492,
+       "step": 25288
+     },
+     {
+       "epoch": 11.81,
+       "learning_rate": 3e-05,
+       "loss": 0.154,
+       "step": 25724
+     },
+     {
+       "epoch": 12.0,
+       "eval_accuracy": 0.7654645700633325,
+       "eval_loss": 0.43657177686691284,
+       "eval_runtime": 220.7713,
+       "eval_samples_per_second": 33.541,
+       "eval_steps_per_second": 2.097,
+       "step": 26148
+     },
+     {
+       "epoch": 12.0,
+       "eval_exact_match": 19.945982444294394,
+       "eval_f1": 27.398959656999555,
+       "eval_qa_bleu": 13.50737485018698,
+       "eval_qa_exact_match": 0.1886563133018231,
+       "eval_recite_bleu": 47.72099251148126,
+       "eval_recite_exact_match": 0.036326806212018906,
+       "step": 26148
+     },
+     {
+       "epoch": 12.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1569,
+       "step": 26160
+     },
+     {
+       "epoch": 12.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1344,
+       "step": 26596
+     },
+     {
+       "epoch": 12.41,
+       "learning_rate": 3e-05,
+       "loss": 0.14,
+       "step": 27032
+     },
+     {
+       "epoch": 12.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1427,
+       "step": 27468
+     },
+     {
+       "epoch": 12.81,
+       "learning_rate": 3e-05,
+       "loss": 0.1441,
+       "step": 27904
+     },
+     {
+       "epoch": 13.0,
+       "eval_accuracy": 0.7656634099218181,
+       "eval_loss": 0.44285184144973755,
+       "eval_runtime": 222.075,
+       "eval_samples_per_second": 33.345,
+       "eval_steps_per_second": 2.085,
+       "step": 28327
+     },
+     {
+       "epoch": 13.0,
+       "eval_exact_match": 20.405131667792034,
+       "eval_f1": 28.457441663021587,
+       "eval_qa_bleu": 14.596564409389117,
+       "eval_qa_exact_match": 0.19446320054017555,
+       "eval_recite_bleu": 48.14075990339527,
+       "eval_recite_exact_match": 0.03902768399729912,
+       "step": 28327
+     },
+     {
+       "epoch": 13.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1502,
+       "step": 28340
+     },
+     {
+       "epoch": 13.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1288,
+       "step": 28776
+     },
+     {
+       "epoch": 13.41,
+       "learning_rate": 3e-05,
+       "loss": 0.1334,
+       "step": 29212
+     },
+     {
+       "epoch": 13.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1389,
+       "step": 29648
+     },
+     {
+       "epoch": 13.81,
+       "learning_rate": 3e-05,
+       "loss": 0.143,
+       "step": 30084
+     },
+     {
+       "epoch": 14.0,
+       "eval_accuracy": 0.7656877650781592,
+       "eval_loss": 0.44177931547164917,
+       "eval_runtime": 226.2366,
+       "eval_samples_per_second": 32.731,
+       "eval_steps_per_second": 2.047,
+       "step": 30506
+     },
+     {
+       "epoch": 14.0,
+       "eval_exact_match": 20.324105334233625,
+       "eval_f1": 28.258140680396128,
+       "eval_qa_bleu": 14.043080335083465,
+       "eval_qa_exact_match": 0.19324780553679946,
+       "eval_recite_bleu": 49.992383315936785,
+       "eval_recite_exact_match": 0.042403781228899394,
+       "step": 30506
+     },
+     {
+       "epoch": 14.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1434,
+       "step": 30520
+     },
+     {
+       "epoch": 14.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1265,
+       "step": 30956
+     },
+     {
+       "epoch": 14.41,
+       "learning_rate": 3e-05,
+       "loss": 0.1291,
+       "step": 31392
+     },
+     {
+       "epoch": 14.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1346,
+       "step": 31828
+     },
+     {
+       "epoch": 14.81,
+       "learning_rate": 3e-05,
+       "loss": 0.1366,
+       "step": 32264
+     },
+     {
+       "epoch": 15.0,
+       "eval_accuracy": 0.7660609078838914,
+       "eval_loss": 0.44691887497901917,
+       "eval_runtime": 226.6588,
+       "eval_samples_per_second": 32.67,
+       "eval_steps_per_second": 2.043,
+       "step": 32685
+     },
+     {
+       "epoch": 15.0,
+       "eval_exact_match": 19.878460499662392,
+       "eval_f1": 27.391401943316232,
+       "eval_qa_bleu": 14.761384752422014,
+       "eval_qa_exact_match": 0.18717083051991898,
+       "eval_recite_bleu": 49.40428435717536,
+       "eval_recite_exact_match": 0.04659014179608373,
+       "step": 32685
+     },
+     {
+       "epoch": 15.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1234,
+       "step": 32700
+     },
+     {
+       "epoch": 15.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1235,
+       "step": 33136
+     },
+     {
+       "epoch": 15.41,
+       "learning_rate": 3e-05,
+       "loss": 0.1285,
+       "step": 33572
+     },
+     {
+       "epoch": 15.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1296,
+       "step": 34008
+     },
+     {
+       "epoch": 15.81,
+       "learning_rate": 3e-05,
+       "loss": 0.1335,
+       "step": 34444
+     },
+     {
+       "epoch": 16.0,
+       "eval_accuracy": 0.7660959865792183,
+       "eval_loss": 0.45172181725502014,
+       "eval_runtime": 220.6741,
+       "eval_samples_per_second": 33.556,
+       "eval_steps_per_second": 2.098,
+       "step": 34864
+     },
+     {
+       "epoch": 16.0,
+       "eval_exact_match": 19.648885887913572,
+       "eval_f1": 27.41473279396055,
+       "eval_qa_bleu": 14.081874080604702,
+       "eval_qa_exact_match": 0.1863605671843349,
+       "eval_recite_bleu": 49.64601003868678,
+       "eval_recite_exact_match": 0.044699527346387574,
+       "step": 34864
+     },
+     {
+       "epoch": 16.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1335,
+       "step": 34880
+     },
+     {
+       "epoch": 16.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1195,
+       "step": 35316
+     },
+     {
+       "epoch": 16.41,
+       "learning_rate": 3e-05,
+       "loss": 0.1245,
+       "step": 35752
+     },
+     {
+       "epoch": 16.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1278,
+       "step": 36188
+     },
+     {
+       "epoch": 16.81,
+       "learning_rate": 3e-05,
+       "loss": 0.1299,
+       "step": 36624
+     },
+     {
+       "epoch": 17.0,
+       "eval_accuracy": 0.7661592372837458,
+       "eval_loss": 0.45338499546051025,
+       "eval_runtime": 221.2554,
+       "eval_samples_per_second": 33.468,
+       "eval_steps_per_second": 2.093,
+       "step": 37043
+     },
+     {
+       "epoch": 17.0,
+       "eval_exact_match": 19.60837272113437,
+       "eval_f1": 27.45798777202767,
+       "eval_qa_bleu": 14.536144503898182,
+       "eval_qa_exact_match": 0.1850101282916948,
+       "eval_recite_bleu": 49.273704799153755,
+       "eval_recite_exact_match": 0.048345712356515864,
+       "step": 37043
+     },
+     {
+       "epoch": 17.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1325,
+       "step": 37060
+     },
+     {
+       "epoch": 17.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1188,
+       "step": 37496
+     },
+     {
+       "epoch": 17.41,
+       "learning_rate": 3e-05,
+       "loss": 0.1226,
+       "step": 37932
+     },
+     {
+       "epoch": 17.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1263,
+       "step": 38368
+     },
+     {
+       "epoch": 17.81,
+       "learning_rate": 3e-05,
+       "loss": 0.1271,
+       "step": 38804
+     },
+     {
+       "epoch": 18.0,
+       "eval_accuracy": 0.7663929740826603,
+       "eval_loss": 0.45788267254829407,
+       "eval_runtime": 224.1369,
+       "eval_samples_per_second": 33.038,
+       "eval_steps_per_second": 2.066,
+       "step": 39222
+     },
+     {
+       "epoch": 18.0,
+       "eval_exact_match": 19.959486833220797,
+       "eval_f1": 27.5331537130017,
+       "eval_qa_bleu": 14.102203160745331,
+       "eval_qa_exact_match": 0.1900067521944632,
+       "eval_recite_bleu": 50.42598107564746,
+       "eval_recite_exact_match": 0.05172180958811614,
+       "step": 39222
+     },
+     {
+       "epoch": 18.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1298,
+       "step": 39240
+     },
+     {
+       "epoch": 18.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1152,
+       "step": 39676
+     },
+     {
+       "epoch": 18.41,
+       "learning_rate": 3e-05,
+       "loss": 0.119,
+       "step": 40112
+     },
+     {
+       "epoch": 18.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1251,
+       "step": 40548
+     },
+     {
+       "epoch": 18.81,
+       "learning_rate": 3e-05,
+       "loss": 0.1268,
+       "step": 40984
+     },
+     {
+       "epoch": 19.0,
+       "eval_accuracy": 0.7664202373173704,
+       "eval_loss": 0.4556055963039398,
+       "eval_runtime": 225.0055,
+       "eval_samples_per_second": 32.91,
+       "eval_steps_per_second": 2.058,
+       "step": 41401
+     },
+     {
+       "epoch": 19.0,
+       "eval_exact_match": 21.02633355840648,
+       "eval_f1": 28.383806355797034,
+       "eval_qa_bleu": 15.303943702882195,
+       "eval_qa_exact_match": 0.19905469277515192,
+       "eval_recite_bleu": 51.33952244561918,
+       "eval_recite_exact_match": 0.05442268737339635,
+       "step": 41401
+     },
+     {
+       "epoch": 19.01,
+       "learning_rate": 3e-05,
+       "loss": 0.1152,
+       "step": 41420
+     },
+     {
+       "epoch": 19.21,
+       "learning_rate": 3e-05,
+       "loss": 0.1157,
+       "step": 41856
+     },
+     {
+       "epoch": 19.41,
+       "learning_rate": 3e-05,
+       "loss": 0.1176,
+       "step": 42292
+     },
+     {
+       "epoch": 19.61,
+       "learning_rate": 3e-05,
+       "loss": 0.1214,
+       "step": 42728
+     },
+     {
+       "epoch": 19.81,
+       "learning_rate": 3e-05,
+       "loss": 0.1238,
+       "step": 43164
+     },
+     {
+       "epoch": 20.0,
+       "eval_accuracy": 0.7664424114149346,
+       "eval_loss": 0.4650193452835083,
+       "eval_runtime": 223.8156,
+       "eval_samples_per_second": 33.085,
+       "eval_steps_per_second": 2.069,
+       "step": 43580
+     },
+     {
+       "epoch": 20.0,
+       "eval_exact_match": 20.432140445644833,
+       "eval_f1": 27.38396738548711,
+       "eval_qa_bleu": 16.030026926999476,
+       "eval_qa_exact_match": 0.19324780553679946,
+       "eval_recite_bleu": 50.43675034450751,
+       "eval_recite_exact_match": 0.053207292370020254,
+       "step": 43580
+     },
+     {
+       "epoch": 20.0,
+       "step": 43580,
+       "total_flos": 2.01650366300928e+18,
+       "train_loss": 0.0060400176945494205,
+       "train_runtime": 6746.5616,
+       "train_samples_per_second": 103.324,
+       "train_steps_per_second": 6.46
+     }
+   ],
+   "logging_steps": 436,
+   "max_steps": 43580,
+   "num_train_epochs": 20,
+   "save_steps": 500,
+   "total_flos": 2.01650366300928e+18,
+   "trial_name": null,
+   "trial_params": null
+ }
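In this log, `eval_loss` reaches its minimum at epoch 9 (0.4190) and climbs to 0.4650 by epoch 20, while `best_metric` and `best_model_checkpoint` are both `null`, so the state file itself does not record which evaluation was best. A minimal sketch for recovering that from `log_history` follows; the helper name `best_eval_entry` is hypothetical, and the inline list is a short excerpt of the entries above rather than the full file.

```python
def best_eval_entry(log_history):
    """Return (epoch, step, eval_loss) for the entry with the lowest eval_loss.

    Entries without an "eval_loss" key (training logs, extra metric rows)
    are skipped.
    """
    eval_entries = [entry for entry in log_history if "eval_loss" in entry]
    best = min(eval_entries, key=lambda entry: entry["eval_loss"])
    return best["epoch"], best["step"], best["eval_loss"]


# Excerpt copied from the log_history in trainer_state.json.
log_history = [
    {"epoch": 8.0, "eval_loss": 0.42204150557518005, "step": 17432},
    {"epoch": 8.0, "eval_f1": 25.859990277646197, "step": 17432},
    {"epoch": 9.0, "eval_loss": 0.41897761821746826, "step": 19611},
    {"epoch": 9.0, "learning_rate": 3e-05, "loss": 0.2067, "step": 19620},
    {"epoch": 10.0, "eval_loss": 0.4250437319278717, "step": 21790},
    {"epoch": 20.0, "eval_loss": 0.4650193452835083, "step": 43580},
]

best_epoch, best_step, best_loss = best_eval_entry(log_history)
print(f"lowest eval_loss {best_loss:.4f} at epoch {best_epoch} (step {best_step})")
```

To run it on the real file, load `trainer_state.json` with `json.load` and pass its `"log_history"` list to the helper.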