Model save

Files changed (3)

README.md +59 -39
generation_config.json +6 -0
model.safetensors +1 -1

README.md CHANGED

@@ -3,18 +3,18 @@ library_name: transformers
 tags:
 - generated_from_trainer
 model-index:
-- name: childes_30
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# childes_30
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.8672
 ## Model description
@@ -33,7 +33,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 30
@@ -41,44 +41,64 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 100000
-- training_steps: 400000
 ### Training results
-| Training Loss | Epoch   | Step  | Validation Loss |
-|:-------------:|:-------:|:-----:|:---------------:|
-| No log        | 1.6502  | 2000  | 5.8888          |
-| 6.5372        | 3.3003  | 4000  | 5.8455          |
-| 6.5372        | 4.9505  | 6000  | 5.7730          |
-| 5.7792        | 6.6007  | 8000  | 5.6880          |
-| 5.7792        | 8.2508  | 10000 | 5.1237          |
-| 5.147         | 9.9010  | 12000 | 4.7844          |
-| 5.147         | 11.5512 | 14000 | 4.5138          |
-| 4.4762        | 13.2013 | 16000 | 4.0074          |
-| 4.4762        | 14.8515 | 18000 | 3.8195          |
-| 3.8668        | 16.5017 | 20000 | 3.6909          |
-| 3.8668        | 18.1518 | 22000 | 3.5956          |
-| 3.6038        | 19.8020 | 24000 | 3.4481          |
-| 3.6038        | 21.4521 | 26000 | 3.3742          |
-| 3.3867        | 23.1023 | 28000 | 3.2639          |
-| 3.3867        | 24.7525 | 30000 | 3.1954          |
-| 3.2158        | 26.4026 | 32000 | 3.1180          |
-| 3.2158        | 28.0528 | 34000 | 3.0905          |
-| 3.0917        | 29.7030 | 36000 | 3.0518          |
-| 3.0917        | 31.3531 | 38000 | 3.0318          |
-| 3.0074        | 33.0033 | 40000 | 2.9881          |
-| 3.0074        | 34.6535 | 42000 | 2.9401          |
-| 2.9357        | 36.3036 | 44000 | 2.9188          |
-| 2.9357        | 37.9538 | 46000 | 2.9058          |
-| 2.8951        | 39.6040 | 48000 | 2.9200          |
-| 2.8951        | 41.2541 | 50000 | 2.9114          |
-| 2.8637        | 42.9043 | 52000 | 2.8812          |
-| 2.8637        | 44.5545 | 54000 | 2.8914          |
-| 2.8428        | 46.2046 | 56000 | 2.8537          |
-| 2.8428        | 47.8548 | 58000 | 2.8884          |
-| 2.8272        | 49.5050 | 60000 | 2.8600          |
-| 2.8272        | 51.1551 | 62000 | 2.8672          |
 ### Framework versions

 tags:
 - generated_from_trainer
 model-index:
+- name: de_childes_30
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# de_childes_30
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.1942
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0001
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 30
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 40000
+- training_steps: 100000
+- mixed_precision_training: Native AMP
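
The new hyperparameters describe a linear schedule with 40000 warmup steps out of 100000 training steps and a peak learning rate of 0.0001. A minimal sketch of that shape follows, assuming the usual rule: a linear ramp from zero during warmup, then a linear decay to zero at the final step. The function name `linear_warmup_lr` is hypothetical and is not taken from this repository.

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=40_000, total_steps=100_000):
    """Learning rate at `step` for linear warmup followed by linear decay.

    Defaults mirror the hyperparameters listed in the model card; the
    schedule shape itself is an assumption about the 'linear' scheduler.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for s in (0, 20_000, 40_000, 70_000, 100_000):
        print(f"step {s:>6}: lr = {linear_warmup_lr(s):.2e}")
```

Under these assumptions the learning rate peaks at step 40000, so 40% of the run is spent in warmup.
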
 ### Training results
+| Training Loss | Epoch   | Step   | Validation Loss |
+|:-------------:|:-------:|:------:|:---------------:|
+| No log        | 1.5021  | 2000   | 7.0910          |
+| 6.9957        | 3.0041  | 4000   | 5.8267          |
+| 6.9957        | 4.5062  | 6000   | 5.4651          |
+| 5.213         | 6.0083  | 8000   | 5.1717          |
+| 5.213         | 7.5103  | 10000  | 4.9590          |
+| 4.7279        | 9.0124  | 12000  | 4.7918          |
+| 4.7279        | 10.5145 | 14000  | 4.6601          |
+| 4.4214        | 12.0165 | 16000  | 4.5524          |
+| 4.4214        | 13.5186 | 18000  | 4.4587          |
+| 4.195         | 15.0207 | 20000  | 4.3684          |
+| 4.195         | 16.5227 | 22000  | 4.2900          |
+| 4.0109        | 18.0248 | 24000  | 4.2277          |
+| 4.0109        | 19.5268 | 26000  | 4.1726          |
+| 3.8596        | 21.0289 | 28000  | 4.1288          |
+| 3.8596        | 22.5310 | 30000  | 4.0892          |
+| 3.7326        | 24.0330 | 32000  | 4.0589          |
+| 3.7326        | 25.5351 | 34000  | 4.0279          |
+| 3.6241        | 27.0372 | 36000  | 4.0029          |
+| 3.6241        | 28.5392 | 38000  | 3.9903          |
+| 3.5294        | 30.0413 | 40000  | 3.9732          |
+| 3.5294        | 31.5434 | 42000  | 3.9699          |
+| 3.4361        | 33.0454 | 44000  | 3.9619          |
+| 3.4361        | 34.5475 | 46000  | 3.9608          |
+| 3.3449        | 36.0496 | 48000  | 3.9666          |
+| 3.3449        | 37.5516 | 50000  | 3.9692          |
+| 3.2664        | 39.0537 | 52000  | 3.9745          |
+| 3.2664        | 40.5558 | 54000  | 3.9862          |
+| 3.1963        | 42.0578 | 56000  | 3.9972          |
+| 3.1963        | 43.5757 | 58000  | 4.0077          |
+| 3.1346        | 45.0777 | 60000  | 4.0204          |
+| 3.1346        | 46.5798 | 62000  | 4.0262          |
+| 3.0792        | 48.0819 | 64000  | 4.0405          |
+| 3.0792        | 49.5839 | 66000  | 4.0522          |
+| 3.0286        | 51.0860 | 68000  | 4.0700          |
+| 3.0286        | 52.5881 | 70000  | 4.0754          |
+| 2.9835        | 54.0901 | 72000  | 4.0929          |
+| 2.9835        | 55.5922 | 74000  | 4.1004          |
+| 2.942         | 57.0943 | 76000  | 4.1144          |
+| 2.942         | 58.5963 | 78000  | 4.1265          |
+| 2.9043        | 60.0984 | 80000  | 4.1325          |
+| 2.9043        | 61.6005 | 82000  | 4.1431          |
+| 2.8693        | 63.1025 | 84000  | 4.1530          |
+| 2.8693        | 64.6046 | 86000  | 4.1612          |
+| 2.8393        | 66.1066 | 88000  | 4.1716          |
+| 2.8393        | 67.6087 | 90000  | 4.1759          |
+| 2.8118        | 69.1108 | 92000  | 4.1815          |
+| 2.8118        | 70.6128 | 94000  | 4.1873          |
+| 2.7874        | 72.1149 | 96000  | 4.1914          |
+| 2.7874        | 73.6170 | 98000  | 4.1935          |
+| 2.7676        | 75.1190 | 100000 | 4.1942          |
 ### Framework versions
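
In the new results table the validation loss stops improving well before the final step, while the training loss keeps falling. That pattern is consistent with overfitting, although the card does not say so. The sketch below copies the Validation Loss column of the `de_childes_30` table (one evaluation every 2000 steps) and locates the lowest value. The helper name `best_eval` is hypothetical.

```python
# Validation losses from the de_childes_30 table, in step order
# (steps 2000, 4000, ..., 100000).
VAL_LOSS = [
    7.0910, 5.8267, 5.4651, 5.1717, 4.9590, 4.7918, 4.6601, 4.5524, 4.4587, 4.3684,
    4.2900, 4.2277, 4.1726, 4.1288, 4.0892, 4.0589, 4.0279, 4.0029, 3.9903, 3.9732,
    3.9699, 3.9619, 3.9608, 3.9666, 3.9692, 3.9745, 3.9862, 3.9972, 4.0077, 4.0204,
    4.0262, 4.0405, 4.0522, 4.0700, 4.0754, 4.0929, 4.1004, 4.1144, 4.1265, 4.1325,
    4.1431, 4.1530, 4.1612, 4.1716, 4.1759, 4.1815, 4.1873, 4.1914, 4.1935, 4.1942,
]
EVAL_EVERY = 2000  # step interval between evaluations, read off the table


def best_eval(val_losses, eval_every=EVAL_EVERY):
    """Return (step, loss) of the evaluation with the lowest validation loss."""
    best_idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return (best_idx + 1) * eval_every, val_losses[best_idx]


best_step, best_loss = best_eval(VAL_LOSS)
final_loss = VAL_LOSS[-1]
gap = final_loss - best_loss  # how far the saved final model is from the best eval

if __name__ == "__main__":
    print(f"best step: {best_step}, best loss: {best_loss}")
    print(f"final loss: {final_loss}, gap to best: {gap:.4f}")
```

The reported evaluation loss of 4.1942 is the value at the last step, not the minimum in the table.
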

generation_config.json ADDED

@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 0,
+  "eos_token_id": 1,
+  "transformers_version": "4.45.2"
+}
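
As a quick check of the added file, the sketch below parses the same JSON with the standard library and collects the special token ids. The variable names are hypothetical. It inspects only the file contents and does not load the model.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG_TEXT = """
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 1,
  "transformers_version": "4.45.2"
}
"""

config = json.loads(GENERATION_CONFIG_TEXT)

# Collect every key that names a special token id.
special_tokens = {k: v for k, v in config.items() if k.endswith("_token_id")}

if __name__ == "__main__":
    print(special_tokens)
    print("pad_token_id set:", "pad_token_id" in config)
```

The file defines only `bos_token_id` and `eos_token_id`; it sets no `pad_token_id`.
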

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c4055a337df2fd66617953610f35e72d1d1cbe9aae95395b2369475ef4520440
 size 51007160

 version https://git-lfs.github.com/spec/v1
+oid sha256:94cc04fdb17016a865f7f7ad4ccf2cf208eb60bf2d25dd7e6628078c8a9e6833
 size 51007160
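
The `model.safetensors` change touches only the Git LFS pointer: the `oid` differs while `size` stays the same. A minimal sketch of parsing and comparing the two pointers follows, assuming the three-line `key value` layout shown above. The helper name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size in bytes
    }


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c4055a337df2fd66617953610f35e72d1d1cbe9aae95395b2369475ef4520440
size 51007160
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:94cc04fdb17016a865f7f7ad4ccf2cf208eb60bf2d25dd7e6628078c8a9e6833
size 51007160
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

if __name__ == "__main__":
    print("same size:", old["size"] == new["size"])
    print("same content hash:", old["digest"] == new["digest"])
```

An identical byte size with a different hash means the stored file has the same length but different contents.
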