Michielo committed on
Commit d51d69c · verified · 1 Parent(s): 776aad4

Update README.md

Files changed (1):
  1. README.md (+13 −13)
README.md CHANGED
@@ -71,19 +71,19 @@ In this section, we report the evaluation results of SmolLM2. All evaluations ar
 
 ## Instruction model Vs. Humanized model
 
- | Metric                       | SmolLM2-135M-Instruct | SmolLM2-135M-Humanized |
- |:-----------------------------|:---------------------:|:----------------------:|
- | MMLU                         | **23.1**              | **23.1**               |
- | ARC (Easy)                   | **54.3**              | 50.2                   |
- | ARC (Challenge)              | **26.1**              | 25.3                   |
- | HellaSwag                    | **43.0**              | 41.6                   |
- | PIQA                         | **67.2**              | 66.2                   |
- | WinoGrande                   | **52.5**              | 52.2                   |
- | TriviaQA                     | **0.3**               | 0.1                    |
- | GSM8K                        | 0.2                   | **0.5**                |
- | OpenBookQA                   | **32.6**              | 32.0                   |
- | CommonSenseQA                | **4.8**               | 2.2                    |
- | QuAC (F1)                    | **14.1**              | 11.0                   |
+ | Metric                       | SmolLM2-135M-Instruct | SmolLM2-135M-Humanized | Difference |
+ |:-----------------------------|:---------------------:|:----------------------:|:----------:|
+ | MMLU                         | **23.1**              | **23.1**               | 0.0        |
+ | ARC (Easy)                   | **54.3**              | 50.2                   | -4.1       |
+ | ARC (Challenge)              | **26.1**              | 25.3                   | -0.8       |
+ | HellaSwag                    | **43.0**              | 41.6                   | -1.4       |
+ | PIQA                         | **67.2**              | 66.2                   | -1.0       |
+ | WinoGrande                   | **52.5**              | 52.2                   | -0.3       |
+ | TriviaQA                     | **0.3**               | 0.1                    | -0.2       |
+ | GSM8K                        | 0.2                   | **0.5**                | +0.3       |
+ | OpenBookQA                   | **32.6**              | 32.0                   | -0.6       |
+ | CommonSenseQA                | **4.8**               | 2.2                    | -2.6       |
+ | QuAC (F1)                    | **14.1**              | 11.0                   | -3.1       |
 
 
 ## Limitations
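
The added Difference column is simply the Humanized score minus the Instruct score for each benchmark. A minimal Python sketch that recomputes it from the table above (the dictionary and names are illustrative only, not part of the repo):

```python
# Recompute the "Difference" column as Humanized minus Instruct.
# Scores are copied from the table above; this is an illustrative sketch only.
scores = {
    "MMLU":            (23.1, 23.1),
    "ARC (Easy)":      (54.3, 50.2),
    "ARC (Challenge)": (26.1, 25.3),
    "HellaSwag":       (43.0, 41.6),
    "PIQA":            (67.2, 66.2),
    "WinoGrande":      (52.5, 52.2),
    "TriviaQA":        (0.3, 0.1),
    "GSM8K":           (0.2, 0.5),
    "OpenBookQA":      (32.6, 32.0),
    "CommonSenseQA":   (4.8, 2.2),
    "QuAC (F1)":       (14.1, 11.0),
}

for metric, (instruct, humanized) in scores.items():
    print(f"{metric:<16} {humanized - instruct:+.1f}")
```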