---
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
license: apache-2.0
library_name: vllm
inference: false
---

# Model Card for Mistral-Small-3.1-24B-Base-2503 (TEXT ONLY)

This is the text-only variant of [mistralai/Mistral-Small-3.1-24B-Base-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503).
This model also serves as the base model for [mistralai/Devstral-Small-2505](https://huggingface.co/mistralai/Devstral-Small-2505), for which no official base model was released.

Features:
- Text-only, no multimodality.
- 128k context length.

How was the text-only model created? The vision encoder was removed, and the model architecture was converted from `mistral3` to `mistral`. The tokenizer was not modified.
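At the config level, the conversion amounts to keeping the text sub-config and dropping the vision fields. A minimal sketch of that step (key names follow the `transformers` Mistral3 config layout and are illustrative; the real conversion also remaps the weight checkpoint, which is not shown here):

```python
def to_text_only_config(mistral3_config: dict) -> dict:
    """Turn a mistral3-style config dict into a plain mistral text config.

    Vision-related top-level fields ("vision_config", "image_token_index",
    "spatial_merge_size", ...) are simply discarded; only the text
    sub-config survives.
    """
    text_cfg = dict(mistral3_config["text_config"])
    text_cfg["model_type"] = "mistral"
    return text_cfg

example = {
    "model_type": "mistral3",
    "vision_config": {"hidden_size": 1024},
    "text_config": {"model_type": "mistral", "hidden_size": 5120},
}
print(to_text_only_config(example)["model_type"])  # mistral
```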

## Reproduced eval

Serve with vLLM:

```
vllm serve casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only
```
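Once the server is up, it exposes an OpenAI-compatible completions endpoint on port 8000 by default. A minimal client sketch (payload fields follow the OpenAI completions API; the prompt is just an example):

```python
import json
import urllib.request

def build_completion_request(prompt: str, base_url: str = "http://localhost:8000"):
    """Build a request against the /v1/completions endpoint vLLM serves."""
    payload = {
        "model": "casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only",
        "prompt": prompt,
        "max_tokens": 64,
        "temperature": 0.0,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_completion_request("The capital of France is")
# urllib.request.urlopen(req)  # requires the vLLM server to be running
print(req.full_url)
```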

The reproduced results are summarized below; the text-only variant matches the multimodal base model within the reported standard error.

| Model                              | MMLU (0-shot)   |
|------------------------------------|-----------------|
| Small 3.1 24B Base (Text Only)     | 77.25% ± 0.33%  |
| Small 3.1 24B Base (Multimodal)    | 77.34% ± 0.33%  |
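A quick sanity check (my own arithmetic, not part of the original evaluation) that the 0.09-point gap is well within noise: treating the two runs as independent, compare the difference to its combined standard error.

```python
import math

# Overall MMLU accuracies and standard errors from the two runs below.
acc_text, se_text = 0.7725, 0.0033
acc_mm, se_mm = 0.7734, 0.0033

diff = acc_mm - acc_text
se_diff = math.sqrt(se_text**2 + se_mm**2)  # combined stderr of the difference
z = diff / se_diff
print(f"diff={diff:.4f}, se={se_diff:.4f}, z={z:.2f}")
```

A |z| of roughly 0.2 is far below any conventional significance threshold, so the two models are statistically indistinguishable on MMLU.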

### Original Multimodal: Full MMLU (Reproduced)

```
lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=mistralai/Mistral-Small-3.1-24B-Base-2503" \
  --tasks mmlu \
  --batch_size 128
```

|                 Tasks                 |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|---------------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu                                   |      2|none  |      |acc   |↑  |0.7734|±  |0.0033|
| - humanities                          |      2|none  |      |acc   |↑  |0.6820|±  |0.0062|
|  - formal_logic                       |      1|none  |     0|acc   |↑  |0.5714|±  |0.0443|
|  - high_school_european_history       |      1|none  |     0|acc   |↑  |0.8303|±  |0.0293|
|  - high_school_us_history             |      1|none  |     0|acc   |↑  |0.9363|±  |0.0171|
|  - high_school_world_history          |      1|none  |     0|acc   |↑  |0.9241|±  |0.0172|
|  - international_law                  |      1|none  |     0|acc   |↑  |0.9091|±  |0.0262|
|  - jurisprudence                      |      1|none  |     0|acc   |↑  |0.8148|±  |0.0376|
|  - logical_fallacies                  |      1|none  |     0|acc   |↑  |0.8589|±  |0.0274|
|  - moral_disputes                     |      1|none  |     0|acc   |↑  |0.8208|±  |0.0206|
|  - moral_scenarios                    |      1|none  |     0|acc   |↑  |0.3844|±  |0.0163|
|  - philosophy                         |      1|none  |     0|acc   |↑  |0.8296|±  |0.0214|
|  - prehistory                         |      1|none  |     0|acc   |↑  |0.8704|±  |0.0187|
|  - professional_law                   |      1|none  |     0|acc   |↑  |0.6095|±  |0.0125|
|  - world_religions                    |      1|none  |     0|acc   |↑  |0.8713|±  |0.0257|
| - other                               |      2|none  |      |acc   |↑  |0.8317|±  |0.0064|
|  - business_ethics                    |      1|none  |     0|acc   |↑  |0.8200|±  |0.0386|
|  - clinical_knowledge                 |      1|none  |     0|acc   |↑  |0.8679|±  |0.0208|
|  - college_medicine                   |      1|none  |     0|acc   |↑  |0.7803|±  |0.0316|
|  - global_facts                       |      1|none  |     0|acc   |↑  |0.6600|±  |0.0476|
|  - human_aging                        |      1|none  |     0|acc   |↑  |0.7982|±  |0.0269|
|  - management                         |      1|none  |     0|acc   |↑  |0.9029|±  |0.0293|
|  - marketing                          |      1|none  |     0|acc   |↑  |0.9359|±  |0.0160|
|  - medical_genetics                   |      1|none  |     0|acc   |↑  |0.8900|±  |0.0314|
|  - miscellaneous                      |      1|none  |     0|acc   |↑  |0.9183|±  |0.0098|
|  - nutrition                          |      1|none  |     0|acc   |↑  |0.8791|±  |0.0187|
|  - professional_accounting            |      1|none  |     0|acc   |↑  |0.6277|±  |0.0288|
|  - professional_medicine              |      1|none  |     0|acc   |↑  |0.8603|±  |0.0211|
|  - virology                           |      1|none  |     0|acc   |↑  |0.5602|±  |0.0386|
| - social sciences                     |      2|none  |      |acc   |↑  |0.8736|±  |0.0059|
|  - econometrics                       |      1|none  |     0|acc   |↑  |0.6491|±  |0.0449|
|  - high_school_geography              |      1|none  |     0|acc   |↑  |0.8990|±  |0.0215|
|  - high_school_government_and_politics|      1|none  |     0|acc   |↑  |0.9637|±  |0.0135|
|  - high_school_macroeconomics         |      1|none  |     0|acc   |↑  |0.8103|±  |0.0199|
|  - high_school_microeconomics         |      1|none  |     0|acc   |↑  |0.9034|±  |0.0192|
|  - high_school_psychology             |      1|none  |     0|acc   |↑  |0.9358|±  |0.0105|
|  - human_sexuality                    |      1|none  |     0|acc   |↑  |0.8855|±  |0.0279|
|  - professional_psychology            |      1|none  |     0|acc   |↑  |0.8578|±  |0.0141|
|  - public_relations                   |      1|none  |     0|acc   |↑  |0.7909|±  |0.0390|
|  - security_studies                   |      1|none  |     0|acc   |↑  |0.8327|±  |0.0239|
|  - sociology                          |      1|none  |     0|acc   |↑  |0.9154|±  |0.0197|
|  - us_foreign_policy                  |      1|none  |     0|acc   |↑  |0.9300|±  |0.0256|
| - stem                                |      2|none  |      |acc   |↑  |0.7545|±  |0.0073|
|  - abstract_algebra                   |      1|none  |     0|acc   |↑  |0.4600|±  |0.0501|
|  - anatomy                            |      1|none  |     0|acc   |↑  |0.8148|±  |0.0336|
|  - astronomy                          |      1|none  |     0|acc   |↑  |0.9211|±  |0.0219|
|  - college_biology                    |      1|none  |     0|acc   |↑  |0.9444|±  |0.0192|
|  - college_chemistry                  |      1|none  |     0|acc   |↑  |0.5700|±  |0.0498|
|  - college_computer_science           |      1|none  |     0|acc   |↑  |0.7100|±  |0.0456|
|  - college_mathematics                |      1|none  |     0|acc   |↑  |0.6200|±  |0.0488|
|  - college_physics                    |      1|none  |     0|acc   |↑  |0.6569|±  |0.0472|
|  - computer_security                  |      1|none  |     0|acc   |↑  |0.8300|±  |0.0378|
|  - conceptual_physics                 |      1|none  |     0|acc   |↑  |0.8170|±  |0.0253|
|  - electrical_engineering             |      1|none  |     0|acc   |↑  |0.7931|±  |0.0338|
|  - elementary_mathematics             |      1|none  |     0|acc   |↑  |0.7910|±  |0.0209|
|  - high_school_biology                |      1|none  |     0|acc   |↑  |0.9323|±  |0.0143|
|  - high_school_chemistry              |      1|none  |     0|acc   |↑  |0.7586|±  |0.0301|
|  - high_school_computer_science       |      1|none  |     0|acc   |↑  |0.8900|±  |0.0314|
|  - high_school_mathematics            |      1|none  |     0|acc   |↑  |0.5185|±  |0.0305|
|  - high_school_physics                |      1|none  |     0|acc   |↑  |0.6291|±  |0.0394|
|  - high_school_statistics             |      1|none  |     0|acc   |↑  |0.7593|±  |0.0292|
|  - machine_learning                   |      1|none  |     0|acc   |↑  |0.6250|±  |0.0460|

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.7734|±  |0.0033|
| - humanities     |      2|none  |      |acc   |↑  |0.6820|±  |0.0062|
| - other          |      2|none  |      |acc   |↑  |0.8317|±  |0.0064|
| - social sciences|      2|none  |      |acc   |↑  |0.8736|±  |0.0059|
| - stem           |      2|none  |      |acc   |↑  |0.7545|±  |0.0073|

### Text Only: Full MMLU (Reproduced)

```
lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only" \
  --tasks mmlu \
  --batch_size 128
```

|                 Tasks                 |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|---------------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu                                   |      2|none  |      |acc   |↑  |0.7725|±  |0.0033|
| - humanities                          |      2|none  |      |acc   |↑  |0.6793|±  |0.0062|
|  - formal_logic                       |      1|none  |     0|acc   |↑  |0.5397|±  |0.0446|
|  - high_school_european_history       |      1|none  |     0|acc   |↑  |0.8364|±  |0.0289|
|  - high_school_us_history             |      1|none  |     0|acc   |↑  |0.9363|±  |0.0171|
|  - high_school_world_history          |      1|none  |     0|acc   |↑  |0.9198|±  |0.0177|
|  - international_law                  |      1|none  |     0|acc   |↑  |0.9008|±  |0.0273|
|  - jurisprudence                      |      1|none  |     0|acc   |↑  |0.8148|±  |0.0376|
|  - logical_fallacies                  |      1|none  |     0|acc   |↑  |0.8405|±  |0.0288|
|  - moral_disputes                     |      1|none  |     0|acc   |↑  |0.8237|±  |0.0205|
|  - moral_scenarios                    |      1|none  |     0|acc   |↑  |0.3765|±  |0.0162|
|  - philosophy                         |      1|none  |     0|acc   |↑  |0.8264|±  |0.0215|
|  - prehistory                         |      1|none  |     0|acc   |↑  |0.8704|±  |0.0187|
|  - professional_law                   |      1|none  |     0|acc   |↑  |0.6108|±  |0.0125|
|  - world_religions                    |      1|none  |     0|acc   |↑  |0.8713|±  |0.0257|
| - other                               |      2|none  |      |acc   |↑  |0.8339|±  |0.0064|
|  - business_ethics                    |      1|none  |     0|acc   |↑  |0.8300|±  |0.0378|
|  - clinical_knowledge                 |      1|none  |     0|acc   |↑  |0.8679|±  |0.0208|
|  - college_medicine                   |      1|none  |     0|acc   |↑  |0.7746|±  |0.0319|
|  - global_facts                       |      1|none  |     0|acc   |↑  |0.6800|±  |0.0469|
|  - human_aging                        |      1|none  |     0|acc   |↑  |0.8027|±  |0.0267|
|  - management                         |      1|none  |     0|acc   |↑  |0.9029|±  |0.0293|
|  - marketing                          |      1|none  |     0|acc   |↑  |0.9402|±  |0.0155|
|  - medical_genetics                   |      1|none  |     0|acc   |↑  |0.8900|±  |0.0314|
|  - miscellaneous                      |      1|none  |     0|acc   |↑  |0.9208|±  |0.0097|
|  - nutrition                          |      1|none  |     0|acc   |↑  |0.8791|±  |0.0187|
|  - professional_accounting            |      1|none  |     0|acc   |↑  |0.6312|±  |0.0288|
|  - professional_medicine              |      1|none  |     0|acc   |↑  |0.8603|±  |0.0211|
|  - virology                           |      1|none  |     0|acc   |↑  |0.5602|±  |0.0386|
| - social sciences                     |      2|none  |      |acc   |↑  |0.8739|±  |0.0059|
|  - econometrics                       |      1|none  |     0|acc   |↑  |0.6667|±  |0.0443|
|  - high_school_geography              |      1|none  |     0|acc   |↑  |0.8939|±  |0.0219|
|  - high_school_government_and_politics|      1|none  |     0|acc   |↑  |0.9585|±  |0.0144|
|  - high_school_macroeconomics         |      1|none  |     0|acc   |↑  |0.8103|±  |0.0199|
|  - high_school_microeconomics         |      1|none  |     0|acc   |↑  |0.9076|±  |0.0188|
|  - high_school_psychology             |      1|none  |     0|acc   |↑  |0.9358|±  |0.0105|
|  - human_sexuality                    |      1|none  |     0|acc   |↑  |0.8855|±  |0.0279|
|  - professional_psychology            |      1|none  |     0|acc   |↑  |0.8578|±  |0.0141|
|  - public_relations                   |      1|none  |     0|acc   |↑  |0.7909|±  |0.0390|
|  - security_studies                   |      1|none  |     0|acc   |↑  |0.8327|±  |0.0239|
|  - sociology                          |      1|none  |     0|acc   |↑  |0.9104|±  |0.0202|
|  - us_foreign_policy                  |      1|none  |     0|acc   |↑  |0.9400|±  |0.0239|
| - stem                                |      2|none  |      |acc   |↑  |0.7520|±  |0.0073|
|  - abstract_algebra                   |      1|none  |     0|acc   |↑  |0.4500|±  |0.0500|
|  - anatomy                            |      1|none  |     0|acc   |↑  |0.8296|±  |0.0325|
|  - astronomy                          |      1|none  |     0|acc   |↑  |0.9211|±  |0.0219|
|  - college_biology                    |      1|none  |     0|acc   |↑  |0.9444|±  |0.0192|
|  - college_chemistry                  |      1|none  |     0|acc   |↑  |0.5600|±  |0.0499|
|  - college_computer_science           |      1|none  |     0|acc   |↑  |0.7100|±  |0.0456|
|  - college_mathematics                |      1|none  |     0|acc   |↑  |0.6200|±  |0.0488|
|  - college_physics                    |      1|none  |     0|acc   |↑  |0.6569|±  |0.0472|
|  - computer_security                  |      1|none  |     0|acc   |↑  |0.8300|±  |0.0378|
|  - conceptual_physics                 |      1|none  |     0|acc   |↑  |0.8213|±  |0.0250|
|  - electrical_engineering             |      1|none  |     0|acc   |↑  |0.7862|±  |0.0342|
|  - elementary_mathematics             |      1|none  |     0|acc   |↑  |0.7804|±  |0.0213|
|  - high_school_biology                |      1|none  |     0|acc   |↑  |0.9290|±  |0.0146|
|  - high_school_chemistry              |      1|none  |     0|acc   |↑  |0.7488|±  |0.0305|
|  - high_school_computer_science       |      1|none  |     0|acc   |↑  |0.8900|±  |0.0314|
|  - high_school_mathematics            |      1|none  |     0|acc   |↑  |0.5222|±  |0.0305|
|  - high_school_physics                |      1|none  |     0|acc   |↑  |0.6225|±  |0.0396|
|  - high_school_statistics             |      1|none  |     0|acc   |↑  |0.7500|±  |0.0295|
|  - machine_learning                   |      1|none  |     0|acc   |↑  |0.6339|±  |0.0457|

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.7725|±  |0.0033|
| - humanities     |      2|none  |      |acc   |↑  |0.6793|±  |0.0062|
| - other          |      2|none  |      |acc   |↑  |0.8339|±  |0.0064|
| - social sciences|      2|none  |      |acc   |↑  |0.8739|±  |0.0059|
| - stem           |      2|none  |      |acc   |↑  |0.7520|±  |0.0073|