fpadovani/de_childes_42

Tags: Fill-Mask, Transformers, Safetensors, roberta, Generated from Trainer

fpadovani committed 4 days ago

Commit 5efd772 (verified) · 1 parent: e27807a

Model save

Files changed (2)

README.md +51 -52
model.safetensors +1 -1

README.md CHANGED

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.1897
 ## Model description
@@ -43,62 +43,61 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 40000
 - training_steps: 100000
-- mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch   | Step   | Validation Loss |
 |:-------------:|:-------:|:------:|:---------------:|
-| No log        | 1.5021  | 2000   | 7.0336          |
-| 6.9417        | 3.0041  | 4000   | 5.8112          |
-| 6.9417        | 4.5062  | 6000   | 5.4586          |
-| 5.2077        | 6.0083  | 8000   | 5.1717          |
-| 5.2077        | 7.5103  | 10000  | 4.9583          |
-| 4.7315        | 9.0124  | 12000  | 4.8002          |
-| 4.7315        | 10.5145 | 14000  | 4.6706          |
-| 4.4253        | 12.0165 | 16000  | 4.5552          |
-| 4.4253        | 13.5186 | 18000  | 4.4522          |
-| 4.1962        | 15.0207 | 20000  | 4.3739          |
-| 4.1962        | 16.5227 | 22000  | 4.2925          |
-| 4.0105        | 18.0248 | 24000  | 4.2277          |
-| 4.0105        | 19.5268 | 26000  | 4.1706          |
-| 3.857         | 21.0289 | 28000  | 4.1245          |
-| 3.857         | 22.5310 | 30000  | 4.0846          |
-| 3.7298        | 24.0330 | 32000  | 4.0589          |
-| 3.7298        | 25.5351 | 34000  | 4.0266          |
-| 3.6218        | 27.0372 | 36000  | 4.0028          |
-| 3.6218        | 28.5392 | 38000  | 3.9863          |
-| 3.5278        | 30.0413 | 40000  | 3.9732          |
-| 3.5278        | 31.5434 | 42000  | 3.9720          |
-| 3.4351        | 33.0454 | 44000  | 3.9599          |
-| 3.4351        | 34.5475 | 46000  | 3.9566          |
-| 3.3444        | 36.0496 | 48000  | 3.9572          |
-| 3.3444        | 37.5516 | 50000  | 3.9680          |
-| 3.2651        | 39.0537 | 52000  | 3.9788          |
-| 3.2651        | 40.5558 | 54000  | 3.9815          |
-| 3.1966        | 42.0578 | 56000  | 3.9928          |
-| 3.1966        | 43.5599 | 58000  | 4.0061          |
-| 3.1344        | 45.0620 | 60000  | 4.0126          |
-| 3.1344        | 46.5640 | 62000  | 4.0198          |
-| 3.0785        | 48.0661 | 64000  | 4.0377          |
-| 3.0785        | 49.5682 | 66000  | 4.0502          |
-| 3.0287        | 51.0702 | 68000  | 4.0644          |
-| 3.0287        | 52.5723 | 70000  | 4.0714          |
-| 2.9837        | 54.0744 | 72000  | 4.0852          |
-| 2.9837        | 55.5764 | 74000  | 4.0964          |
-| 2.9422        | 57.0785 | 76000  | 4.1148          |
-| 2.9422        | 58.5805 | 78000  | 4.1221          |
-| 2.9052        | 60.0826 | 80000  | 4.1276          |
-| 2.9052        | 61.5847 | 82000  | 4.1346          |
-| 2.8708        | 63.0867 | 84000  | 4.1505          |
-| 2.8708        | 64.5888 | 86000  | 4.1574          |
-| 2.839         | 66.0909 | 88000  | 4.1675          |
-| 2.839         | 67.5929 | 90000  | 4.1727          |
-| 2.8117        | 69.0950 | 92000  | 4.1767          |
-| 2.8117        | 70.5971 | 94000  | 4.1823          |
-| 2.7886        | 72.0991 | 96000  | 4.1867          |
-| 2.7886        | 73.6012 | 98000  | 4.1872          |
-| 2.768         | 75.1033 | 100000 | 4.1897          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.6072
 ## Model description
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 40000
 - training_steps: 100000
 ### Training results
 | Training Loss | Epoch   | Step   | Validation Loss |
 |:-------------:|:-------:|:------:|:---------------:|
+| No log        | 1.5021  | 2000   | 7.5052          |
+| 7.4063        | 3.0041  | 4000   | 6.4338          |
+| 7.4063        | 4.5062  | 6000   | 6.3051          |
+| 6.0672        | 6.0083  | 8000   | 6.1663          |
+| 6.0672        | 7.5103  | 10000  | 6.0952          |
+| 5.8691        | 9.0124  | 12000  | 6.0122          |
+| 5.8691        | 10.5145 | 14000  | 5.9512          |
+| 5.7209        | 12.0165 | 16000  | 5.8691          |
+| 5.7209        | 13.5186 | 18000  | 5.8529          |
+| 5.6105        | 15.0207 | 20000  | 5.7965          |
+| 5.6105        | 16.5227 | 22000  | 5.7404          |
+| 5.5302        | 18.0248 | 24000  | 5.7424          |
+| 5.5302        | 19.5268 | 26000  | 5.7256          |
+| 5.4587        | 21.0289 | 28000  | 5.6831          |
+| 5.4587        | 22.5310 | 30000  | 5.3966          |
+| 5.0899        | 24.0330 | 32000  | 4.7042          |
+| 5.0899        | 25.5351 | 34000  | 4.2317          |
+| 4.0988        | 27.0372 | 36000  | 3.9093          |
+| 4.0988        | 28.5494 | 38000  | 3.7496          |
+| 3.555         | 30.0514 | 40000  | 3.5961          |
+| 3.555         | 31.5535 | 42000  | 3.4542          |
+| 3.2522        | 33.0556 | 44000  | 3.3300          |
+| 3.2522        | 34.5576 | 46000  | 3.2830          |
+| 3.0484        | 36.0597 | 48000  | 3.1864          |
+| 3.0484        | 37.5618 | 50000  | 3.1189          |
+| 2.9026        | 39.0638 | 52000  | 3.0475          |
+| 2.9026        | 40.5659 | 54000  | 2.9933          |
+| 2.7874        | 42.0680 | 56000  | 2.9411          |
+| 2.7874        | 43.5700 | 58000  | 2.9355          |
+| 2.7001        | 45.0721 | 60000  | 2.8913          |
+| 2.7001        | 46.5742 | 62000  | 2.8601          |
+| 2.6298        | 48.0762 | 64000  | 2.8227          |
+| 2.6298        | 49.5783 | 66000  | 2.8202          |
+| 2.5722        | 51.0804 | 68000  | 2.7874          |
+| 2.5722        | 52.5824 | 70000  | 2.7716          |
+| 2.523         | 54.0845 | 72000  | 2.7363          |
+| 2.523         | 55.5866 | 74000  | 2.7212          |
+| 2.4788        | 57.0886 | 76000  | 2.6944          |
+| 2.4788        | 58.5907 | 78000  | 2.6761          |
+| 2.4466        | 60.0928 | 80000  | 2.6705          |
+| 2.4466        | 61.5948 | 82000  | 2.6551          |
+| 2.4122        | 63.0969 | 84000  | 2.6368          |
+| 2.4122        | 64.5989 | 86000  | 2.6424          |
+| 2.3832        | 66.1010 | 88000  | 2.6345          |
+| 2.3832        | 67.6031 | 90000  | 2.6295          |
+| 2.3592        | 69.1051 | 92000  | 2.6247          |
+| 2.3592        | 70.6072 | 94000  | 2.6303          |
+| 2.347         | 72.1093 | 96000  | 2.5866          |
+| 2.347         | 73.6113 | 98000  | 2.6067          |
+| 2.3336        | 75.1134 | 100000 | 2.6072          |
 ### Framework versions
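
The `Loss` line in the README is the validation loss at the last logged step, which need not be the lowest value in the table. As a rough sketch for comparing the two sides of this diff, the snippet below parses a few rows copied from the tables above and reports the lowest and the final validation loss for each run. The names `parse_rows`, `summarize`, `OLD_ROWS` and `NEW_ROWS` are illustrative only, and the rows are an excerpt, not the full tables.

```python
# Excerpted rows from the "-" (old) and "+" (new) sides of the README diff.
OLD_ROWS = """\
-| 3.4351        | 33.0454 | 44000  | 3.9599          |
-| 3.4351        | 34.5475 | 46000  | 3.9566          |
-| 3.3444        | 36.0496 | 48000  | 3.9572          |
-| 2.768         | 75.1033 | 100000 | 4.1897          |
"""
NEW_ROWS = """\
+| 2.3592        | 70.6072 | 94000  | 2.6303          |
+| 2.347         | 72.1093 | 96000  | 2.5866          |
+| 2.347         | 73.6113 | 98000  | 2.6067          |
+| 2.3336        | 75.1134 | 100000 | 2.6072          |
"""


def parse_rows(text):
    """Return (step, validation_loss) pairs from diff-prefixed table rows."""
    rows = []
    for line in text.splitlines():
        # Drop the leading diff marker, then the outer pipes of the table row.
        body = line.strip().lstrip("+-").strip().strip("|")
        cells = [cell.strip() for cell in body.split("|")]
        if len(cells) != 4:
            continue
        try:
            rows.append((int(cells[2]), float(cells[3])))
        except ValueError:
            # Header and separator rows do not parse as numbers; skip them.
            continue
    return rows


def summarize(text):
    """Return (best_row, final_row), where best_row has the lowest loss."""
    rows = parse_rows(text)
    return min(rows, key=lambda row: row[1]), rows[-1]


best_old, final_old = summarize(OLD_ROWS)
best_new, final_new = summarize(NEW_ROWS)
print("old run: best", best_old, "final", final_old)
print("new run: best", best_new, "final", final_new)
```

In the old table the validation loss is lowest at step 46000 (3.9566) and then rises to 4.1897 by step 100000, while the training loss keeps falling. In the new table the lowest value is at step 96000 (2.5866), close to the final 2.6072.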

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e20463828899f128c41fc953aaeebd36fea6da55296ae7e463adcc256059ce8c
 size 59702184

 version https://git-lfs.github.com/spec/v1
+oid sha256:0fedcd28fc45bdf67db1d349a56bcaeda98afe9c946dccf8985f8ad59076e7bb
 size 59702184
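
The `model.safetensors` entry is a Git LFS pointer, so the diff shows only the pointer's `oid` and `size` fields rather than the weights. A minimal sketch for comparing the two pointers follows; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any Git LFS tooling.

```python
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e20463828899f128c41fc953aaeebd36fea6da55296ae7e463adcc256059ce8c
size 59702184
"""
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0fedcd28fc45bdf67db1d349a56bcaeda98afe9c946dccf8985f8ad59076e7bb
size 59702184
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
print("same size:", old["size"] == new["size"])
print("same content hash:", old["oid"] == new["oid"])
```

Here the size is unchanged (59702184 bytes) while the SHA-256 differs, which is consistent with retrained weights stored in tensors of the same shapes, although the pointer alone cannot confirm that.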