Ellio98/mistral-0.5B-base

Text Generation

Generated from Trainer

text-generation-inference

Ellio98 committed on Mar 7

Commit 3aaa02d · verified · 1 Parent(s): 2212a58

End of training

Files changed (4)

README.md +18 -0
model.safetensors +1 -1
tokenizer.json +2 -2
training_args.bin +1 -1

README.md CHANGED

@@ -14,6 +14,8 @@ should probably proofread and complete it, then remove this comment. -->
 # mistral-0.5B-base
 This model is a fine-tuned version of [Ellio98/mistral-0.5B-base](https://huggingface.co/Ellio98/mistral-0.5B-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.0625
 ## Model description
@@ -43,6 +45,22 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 100
 - num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 2.4315        | 0.1   | 129  | 2.4184          |
+| 2.3834        | 0.2   | 258  | 2.3990          |
+| 2.4535        | 0.3   | 387  | 2.3923          |
+| 2.2178        | 0.4   | 516  | 2.3198          |
+| 2.3863        | 0.5   | 645  | 2.2612          |
+| 2.2739        | 0.6   | 774  | 2.2014          |
+| 2.0353        | 0.7   | 903  | 2.1402          |
+| 2.1386        | 0.8   | 1032 | 2.0911          |
+| 2.0759        | 0.9   | 1161 | 2.0672          |
+| 2.1736        | 1.0   | 1290 | 2.0625          |
 ### Framework versions
 - Transformers 4.47.0
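
The validation losses added to the README can be easier to compare as perplexities. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language modeling but is not stated in this commit. The `perplexity` helper and the `validation_losses` list are illustrative names, not part of the repository.

```python
import math

# (step, validation loss) pairs copied from the "Training results" table.
validation_losses = [
    (129, 2.4184),
    (258, 2.3990),
    (387, 2.3923),
    (516, 2.3198),
    (645, 2.2612),
    (774, 2.2014),
    (903, 2.1402),
    (1032, 2.0911),
    (1161, 2.0672),
    (1290, 2.0625),
]


def perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(loss)


# Print the loss and the derived perplexity at every evaluation step.
for step, loss in validation_losses:
    print(f"step {step:>4}: loss {loss:.4f}, perplexity {perplexity(loss):.2f}")
```

If the loss were instead reported in bits, the conversion would be `2 ** loss`, so the assumption about units matters when comparing against other model cards.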

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:be9693d017d1d70e9b6d4d5a3a76dd2725ee2189dc1dc6439fe04b1a0e0e44ab
+oid sha256:7c00fb6335ef2b3b089ee315a68c6a89ee11c84075715fa298e20e22ddd1a3c3
 size 2054379856
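
The model.safetensors diff touches only a Git LFS pointer: the weights file keeps the same size and gets a new SHA-256 object ID. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser assumes the simple `key value` line layout visible in this diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so a multi-gigabyte file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New side of the pointer, copied from the diff.
new_pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:7c00fb6335ef2b3b089ee315a68c6a89ee11c84075715fa298e20e22ddd1a3c3
size 2054379856
"""
)
print(new_pointer["algo"], new_pointer["size"])
```

Calling `matches_pointer("model.safetensors", new_pointer)` would require the actual weights file to be present locally; the path here is only a placeholder.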

tokenizer.json CHANGED

@@ -2,13 +2,13 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 1024,
+    "max_length": 512,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
     "strategy": {
-      "Fixed": 1024
+      "Fixed": 512
     },
     "direction": "Left",
     "pad_to_multiple_of": null,
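
The tokenizer.json diff moves two values together: `truncation.max_length` and the fixed padding length both drop from 1024 to 512. The sketch below reproduces that change on a dictionary holding only the fields visible in the diff. A real tokenizer.json has many more keys, and `set_fixed_length` is a hypothetical helper written for illustration.

```python
import json

# Old side of the diff, reduced to the fields that appear in the hunk.
old_fragment = json.loads(
    """
{
  "version": "1.0",
  "truncation": {
    "direction": "Right",
    "max_length": 1024,
    "strategy": "LongestFirst",
    "stride": 0
  },
  "padding": {
    "strategy": {"Fixed": 1024},
    "direction": "Left",
    "pad_to_multiple_of": null
  }
}
"""
)


def set_fixed_length(config: dict, length: int) -> dict:
    """Return a copy of `config` with truncation and fixed padding set to `length`."""
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    updated["truncation"]["max_length"] = length
    updated["padding"]["strategy"] = {"Fixed": length}
    return updated


new_fragment = set_fixed_length(old_fragment, 512)
print(json.dumps(new_fragment, indent=2))
```

With the `tokenizers` library these settings are normally applied through `Tokenizer.enable_truncation` and `Tokenizer.enable_padding` rather than by editing the JSON; the direct edit is shown here only because it mirrors the diff.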

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f176388615956e18f4bfbb19a4d07e687b24a5adadc17174fb68f80e3b8da74d
+oid sha256:9ee1aca07e2c000d45f1af350f62cf6b3c2038c85f5c343accf1e2064ef3cee1
 size 5304