Update README.md
README.md (CHANGED)
@@ -43,7 +43,7 @@ pip install transformers
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-checkpoint = "AssistantsLab/SmolLM2-
+checkpoint = "AssistantsLab/SmolLM2-360M-humanized"
 
 device = "cuda" # for GPU usage or "cpu" for CPU usage
 tokenizer = AutoTokenizer.from_pretrained(checkpoint)
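Only the top of the snippet appears in this hunk; the next hunk's context line (`print(tokenizer.decode(outputs[0]))`) shows the README goes on to generate text. For reference, a minimal end-to-end sketch of the updated example, assuming the standard `transformers` generation API; the chat-template call, prompt, and sampling settings are illustrative assumptions, not taken from the diff:

```python
# Minimal sketch of the full snippet. Only the first few lines appear in the
# diff; the chat-template step, prompt, and sampling settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "AssistantsLab/SmolLM2-360M-humanized"
device = "cuda"  # for GPU usage or "cpu" for CPU usage
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

# Build a chat-formatted prompt and generate a reply.
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)
outputs = model.generate(input_ids, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0]))
```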
@@ -62,7 +62,7 @@ print(tokenizer.decode(outputs[0]))
 You can also use the TRL CLI to chat with the model from the terminal:
 ```bash
 pip install trl
-trl chat --model_name_or_path AssistantsLab/SmolLM2-
+trl chat --model_name_or_path AssistantsLab/SmolLM2-360M-humanized --device cpu
 ```
 
 ## Evaluation
@@ -73,16 +73,16 @@ In this section, we report the evaluation results of SmolLM2. All evaluations ar
 
 | Metric                       | SmolLM2-135M-Instruct | SmolLM2-135M-Humanized | Difference |
 |:-----------------------------|:---------------------:|:----------------------:|:----------:|
-| MMLU                         |
-| ARC (Easy)                   | **56.4** |
-| ARC (Challenge)              | **31.7** |
-| HellaSwag                    | **56.9** |
-| PIQA                         | **71.6** |
-| WinoGrande                   |
-| TriviaQA                     | 0.2
-| GSM8K                        | 0.3
-| OpenBookQA                   | **36.2** |
-| QuAC (F1)                    |
+| MMLU                         | 24.0                  | **25.4**               | +1.4       |
+| ARC (Easy)                   | **56.4**              | 54.8                   | -1.6       |
+| ARC (Challenge)              | **31.7**              | 28.8                   | -2.9       |
+| HellaSwag                    | **56.9**              | 54.9                   | -2.0       |
+| PIQA                         | **71.6**              | 70.6                   | -1.0       |
+| WinoGrande                   | 54.8                  | **54.9**               | +0.1       |
+| TriviaQA                     | **0.2**               | **0.2**                | +0.0       |
+| GSM8K                        | **0.3**               | 0.1                    | -0.2       |
+| OpenBookQA                   | **36.2**              | 35.8                   | -0.4       |
+| QuAC (F1)                    | 23.0                  | **23.7**               | +0.7       |
 
 
 ## Limitations