End of training
- README.md +18 -0
- model.safetensors +1 -1
- tokenizer.json +2 -2
- training_args.bin +1 -1
README.md
CHANGED
@@ -14,6 +14,8 @@ should probably proofread and complete it, then remove this comment. -->
 # mistral-0.5B-base
 
 This model is a fine-tuned version of [Ellio98/mistral-0.5B-base](https://huggingface.co/Ellio98/mistral-0.5B-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.0625
 
 ## Model description
 
@@ -43,6 +45,22 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 100
 - num_epochs: 1
 
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 2.4315        | 0.1   | 129  | 2.4184          |
+| 2.3834        | 0.2   | 258  | 2.3990          |
+| 2.4535        | 0.3   | 387  | 2.3923          |
+| 2.2178        | 0.4   | 516  | 2.3198          |
+| 2.3863        | 0.5   | 645  | 2.2612          |
+| 2.2739        | 0.6   | 774  | 2.2014          |
+| 2.0353        | 0.7   | 903  | 2.1402          |
+| 2.1386        | 0.8   | 1032 | 2.0911          |
+| 2.0759        | 0.9   | 1161 | 2.0672          |
+| 2.1736        | 1.0   | 1290 | 2.0625          |
+
+
 ### Framework versions
 
 - Transformers 4.47.0
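The validation losses in the "Training results" table are easier to compare as perplexities. The sketch below assumes the reported loss is a mean token-level cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this commit. The names `VALIDATION_LOSS`, `perplexity`, and `best_step` are illustrative and not part of the repository.

```python
import math

# Validation losses copied from the "Training results" table, keyed by step.
VALIDATION_LOSS = {
    129: 2.4184,
    258: 2.3990,
    387: 2.3923,
    516: 2.3198,
    645: 2.2612,
    774: 2.2014,
    903: 2.1402,
    1032: 2.0911,
    1161: 2.0672,
    1290: 2.0625,
}


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


def best_step(losses: dict) -> int:
    """Return the step with the lowest validation loss."""
    return min(losses, key=losses.get)


if __name__ == "__main__":
    for step, loss in VALIDATION_LOSS.items():
        print(f"step {step:>4}: loss {loss:.4f}  perplexity {perplexity(loss):.2f}")
    print("lowest validation loss at step", best_step(VALIDATION_LOSS))
```

In this table the validation loss falls at every evaluation, so the final checkpoint at step 1290 is also the one with the lowest validation loss.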
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7c00fb6335ef2b3b089ee315a68c6a89ee11c84075715fa298e20e22ddd1a3c3
 size 2054379856
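The diff for `model.safetensors` only changes a Git LFS pointer: three text lines recording the spec version, the SHA-256 of the real file, and its size in bytes. A downloaded copy can be checked against that pointer. This is a minimal sketch using only the standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever the reader downloaded.

```python
import hashlib

# New pointer contents for model.safetensors, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7c00fb6335ef2b3b089ee315a68c6a89ee11c84075715fa298e20e22ddd1a3c3
size 2054379856
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file at `path` and compare its size and hash to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

Reading in chunks keeps memory use flat, which matters for a file of roughly 2 GB such as this one.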
tokenizer.json
CHANGED
@@ -2,13 +2,13 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length":
+    "max_length": 512,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
     "strategy": {
-      "Fixed":
+      "Fixed": 512
     },
     "direction": "Left",
     "pad_to_multiple_of": null,
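The new `tokenizer.json` settings mean every encoded sequence comes out at exactly 512 tokens: longer inputs lose tokens from the right, shorter ones get padding on the left. The pure-Python sketch below illustrates that behaviour on lists of token ids. It does not call the tokenizer library, and `PAD_ID = 0` is an assumption, since the pad id is not visible in this hunk.

```python
MAX_LENGTH = 512  # "max_length" and "Fixed" values from the new tokenizer.json
PAD_ID = 0        # assumption: the actual pad id is not shown in this diff


def truncate_right(ids, max_length=MAX_LENGTH):
    """Truncation direction "Right": keep the start, drop tokens from the end."""
    return list(ids[:max_length])


def pad_left(ids, length=MAX_LENGTH, pad_id=PAD_ID):
    """Fixed-length padding with direction "Left": prepend pad ids."""
    return [pad_id] * (length - len(ids)) + list(ids)


def prepare(ids):
    """Apply truncation, then padding, and build the matching attention mask."""
    kept = truncate_right(ids)
    padded = pad_left(kept)
    mask = [0] * (len(padded) - len(kept)) + [1] * len(kept)
    return padded, mask
```

For example, `prepare([5, 6, 7])` yields 509 pad ids followed by `5, 6, 7`, with a mask that is 1 only on the last three positions. The `"LongestFirst"` strategy only matters when a pair of sequences is encoded together, so it is left out of this single-sequence sketch.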
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:9ee1aca07e2c000d45f1af350f62cf6b3c2038c85f5c343accf1e2064ef3cee1
 size 5304