tyzhu/lmind_nq_train600_eval300_v1_recite_qa_gpt2-xl

@@ -3,23 +3,11 @@ license: mit
 base_model: gpt2-xl
 tags:
 - generated_from_trainer
-datasets:
-- tyzhu/lmind_nq_train600_eval300_v1_recite_qa
 metrics:
 - accuracy
 model-index:
 - name: lmind_nq_train600_eval300_v1_recite_qa_gpt2-xl
-  results:
-  - task:
-      name: Causal Language Modeling
-      type: text-generation
-    dataset:
-      name: tyzhu/lmind_nq_train600_eval300_v1_recite_qa
-      type: tyzhu/lmind_nq_train600_eval300_v1_recite_qa
-    metrics:
-    - name: Accuracy
-      type: accuracy
-      value: 0.841281045751634
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -27,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 # lmind_nq_train600_eval300_v1_recite_qa_gpt2-xl
-This model is a fine-tuned version of [gpt2-xl](https://huggingface.co/gpt2-xl) on the tyzhu/lmind_nq_train600_eval300_v1_recite_qa dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3552
 - Accuracy: 0.8413
 ## Model description
@@ -55,31 +43,51 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
-- num_epochs: 10.0
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 2.4392        | 0.51  | 47   | 2.1629          | 0.5938   |
-| 2.2015        | 1.01  | 94   | 1.8377          | 0.6250   |
-| 1.5193        | 1.52  | 141  | 1.8218          | 0.6524   |
-| 1.405         | 2.02  | 188  | 1.2418          | 0.6983   |
-| 0.8654        | 2.53  | 235  | 1.0606          | 0.7264   |
-| 0.7158        | 3.03  | 282  | 0.7902          | 0.7665   |
-| 0.4185        | 3.54  | 329  | 0.6745          | 0.7867   |
-| 0.4556        | 4.04  | 376  | 0.5201          | 0.8105   |
-| 0.253         | 4.55  | 423  | 0.4699          | 0.8198   |
-| 0.1895        | 5.05  | 470  | 0.4096          | 0.8301   |
-| 0.1649        | 5.56  | 517  | 0.3918          | 0.8323   |
-| 0.1468        | 6.06  | 564  | 0.3684          | 0.8367   |
-| 0.1179        | 6.57  | 611  | 0.3645          | 0.8377   |
-| 0.1171        | 7.08  | 658  | 0.3590          | 0.8394   |
-| 0.0953        | 7.58  | 705  | 0.3537          | 0.8402   |
-| 0.0831        | 8.09  | 752  | 0.3615          | 0.8400   |
-| 0.0824        | 8.59  | 799  | 0.3527          | 0.8408   |
-| 0.0764        | 9.1   | 846  | 0.3586          | 0.8410   |
-| 0.0731        | 9.6   | 893  | 0.3589          | 0.8408   |
 ### Framework versions

 base_model: gpt2-xl
 tags:
 - generated_from_trainer
 metrics:
 - accuracy
 model-index:
 - name: lmind_nq_train600_eval300_v1_recite_qa_gpt2-xl
+  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # lmind_nq_train600_eval300_v1_recite_qa_gpt2-xl
+This model is a fine-tuned version of [gpt2-xl](https://huggingface.co/gpt2-xl) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3638
 - Accuracy: 0.8413
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
+- num_epochs: 20.0
 ### Training results
+| Training Loss | Epoch | Step | Accuracy | Validation Loss |
+|:-------------:|:-----:|:----:|:--------:|:---------------:|
+| 2.4392        | 0.51  | 47   | 0.5938   | 2.1629          |
+| 2.2015        | 1.01  | 94   | 0.6250   | 1.8377          |
+| 1.5193        | 1.52  | 141  | 0.6524   | 1.8218          |
+| 1.405         | 2.02  | 188  | 0.6983   | 1.2418          |
+| 0.8654        | 2.53  | 235  | 0.7264   | 1.0606          |
+| 0.7158        | 3.03  | 282  | 0.7665   | 0.7902          |
+| 0.4185        | 3.54  | 329  | 0.7867   | 0.6745          |
+| 0.4556        | 4.04  | 376  | 0.8105   | 0.5201          |
+| 0.253         | 4.55  | 423  | 0.8198   | 0.4699          |
+| 0.1895        | 5.05  | 470  | 0.8301   | 0.4096          |
+| 0.1649        | 5.56  | 517  | 0.8323   | 0.3918          |
+| 0.1468        | 6.06  | 564  | 0.8367   | 0.3684          |
+| 0.1179        | 6.57  | 611  | 0.8377   | 0.3645          |
+| 0.1171        | 7.08  | 658  | 0.8394   | 0.3590          |
+| 0.0953        | 7.58  | 705  | 0.8402   | 0.3537          |
+| 0.0831        | 8.09  | 752  | 0.8400   | 0.3615          |
+| 0.0824        | 8.59  | 799  | 0.8408   | 0.3527          |
+| 0.0764        | 9.1   | 846  | 0.8410   | 0.3586          |
+| 0.0731        | 9.6   | 893  | 0.8408   | 0.3589          |
+| 0.0637        | 10.11 | 940  | 0.8412   | 0.3618          |
+| 0.069         | 10.61 | 987  | 0.8415   | 0.3563          |
+| 0.0636        | 11.12 | 1034 | 0.8409   | 0.3650          |
+| 0.0663        | 11.62 | 1081 | 0.8418   | 0.3554          |
+| 0.0636        | 12.13 | 1128 | 0.8409   | 0.3675          |
+| 0.0632        | 12.63 | 1175 | 0.8418   | 0.3590          |
+| 0.0605        | 13.14 | 1222 | 0.8413   | 0.3635          |
+| 0.0621        | 13.65 | 1269 | 0.8413   | 0.3558          |
+| 0.0579        | 14.15 | 1316 | 0.8410   | 0.3682          |
+| 0.0618        | 14.66 | 1363 | 0.8405   | 0.3653          |
+| 0.0552        | 15.16 | 1410 | 0.8413   | 0.3661          |
+| 0.0619        | 15.67 | 1457 | 0.8416   | 0.3596          |
+| 0.0536        | 16.17 | 1504 | 0.8414   | 0.3710          |
+| 0.0602        | 16.68 | 1551 | 0.8418   | 0.3609          |
+| 0.054         | 17.18 | 1598 | 0.8410   | 0.3759          |
+| 0.0635        | 17.69 | 1645 | 0.8414   | 0.3597          |
+| 0.0536        | 18.19 | 1692 | 0.8410   | 0.3750          |
+| 0.0588        | 18.7  | 1739 | 0.8414   | 0.3684          |
+| 0.0713        | 19.2  | 1786 | 0.8411   | 0.3691          |
+| 0.0704        | 19.71 | 1833 | 0.8413   | 0.3638          |
 ### Framework versions

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:83824ec995c6529762b6535e63512aabd5e3f31377ab61e7396e4f94aa5df53a
 size 6230637102

 version https://git-lfs.github.com/spec/v1
+oid sha256:61ed51b164e98ba9303881ebaa6a9f1b4939a0edd647a9e5654a453eef41baa5
 size 6230637102
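
The pointer change above swaps only the `oid` while the `size` stays the same, so a stale local copy of the weights cannot be detected by file size alone. The following is a minimal sketch, not part of this commit, of checking a downloaded file against a Git LFS pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the pointer uses the `key value` line layout shown in the diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash-algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    path = Path(path)
    # Cheap check first: a size mismatch rules the file out without hashing it.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a multi-gigabyte checkpoint is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the new side of the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:61ed51b164e98ba9303881ebaa6a9f1b4939a0edd647a9e5654a453eef41baa5
size 6230637102
"""
```

Assuming the weights were downloaded to a local file named `pytorch_model.bin`, `matches_pointer("pytorch_model.bin", parse_lfs_pointer(NEW_POINTER))` would report whether that copy corresponds to the new revision rather than the old one.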