Update README.md
README.md CHANGED
@@ -11,11 +11,11 @@ library_name: transformers

<h3 align="center">
<b>
- <span
+ <span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
<br/>
Unlocking the Reasoning Potential of Language Model<br/>From Pretraining to Posttraining
<br/>
- <span
+ <span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
<br/>
</b>
</h3>
@@ -24,9 +24,9 @@ library_name: transformers

<div align="center" style="line-height: 1;">
|
- <a href="https://huggingface.co/
+ <a href="https://huggingface.co/XiaomiMiMo" target="_blank">🤗 HuggingFace</a>
|
- <a href="https://www.modelscope.cn/
+ <a href="https://www.modelscope.cn/organization/XiaomiMiMo" target="_blank">🤖️ ModelScope</a>
|
<a href="https://arxiv.org/abs/2505.07608" target="_blank">📖 Technical Report</a>
|
@@ -39,7 +39,7 @@ library_name: transformers

## Updates

- [2025.05.30]
+ [2025.05.30] We scaled the SFT dataset from approximately 500K to 6M instances and continuously expanded the RL training window size from 32K to 48K. With these changes, the performance of [MiMo-7B-RL-0530](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-0530) on AIME24 improves continuously and eventually surpasses that of DeepSeek R1 (79.8).

<table>
<thead>
@@ -53,7 +53,7 @@ library_name: transformers
<tr>
<td colspan="3"><strong>Mathematics</strong></td>
<p align="center">
- <td rowspan="11"><img width="80%" src="https://github.com/XiaomiMiMo/MiMo
+ <td rowspan="11"><img width="80%" src="https://github.com/XiaomiMiMo/MiMo/raw/main/figures/length.jpg?raw=true"></td>
</p>
</tr>
<tr><td>MATH500<br/>(Pass@1)</td><td>95.8</td><td>97.2</td></tr>
@@ -108,8 +108,7 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during
<img width="80%" src="https://github.com/XiaomiMiMo/MiMo/raw/main/figures/architecture.png?raw=true">
</p>

- > Models are available at [
-
+ > Models are available at [https://huggingface.co/XiaomiMiMo](https://huggingface.co/XiaomiMiMo) and [https://www.modelscope.cn/organization/XiaomiMiMo](https://www.modelscope.cn/organization/XiaomiMiMo)

| **Model** | **Description** | **Download (HuggingFace)** | **Download (ModelScope)** |
| :-------------: | :---------------------------------------------------------------------------: | :-------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------: |
@@ -117,7 +116,6 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during
| MiMo-7B-RL-Zero | RL model trained from base model | [🤗 XiaomiMiMo/MiMo-7B-RL-Zero](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-Zero) | [🤖️ XiaomiMiMo/MiMo-7B-RL-Zero](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL-Zero) |
| MiMo-7B-SFT | SFT model trained from base model | [🤗 XiaomiMiMo/MiMo-7B-SFT](https://huggingface.co/XiaomiMiMo/MiMo-7B-SFT) | [🤖️ XiaomiMiMo/MiMo-7B-SFT](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-SFT) |
| MiMo-7B-RL | RL model trained from SFT model, superior performance matching OpenAI o1-mini | [🤗 XiaomiMiMo/MiMo-7B-RL](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL) | [🤖️ XiaomiMiMo/MiMo-7B-RL](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL) |
- | MiMo-7B-RL-0530 | Advanced RL model with extended length | [🤗 XiaomiMiMo/MiMo-7B-RL-0530](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-0530) | [🤖️ XiaomiMiMo/MiMo-7B-RL-0530](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL-0530) |

## III. Evaluation Results

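For reference, the checkpoints listed in the model table above can also be fetched programmatically. The snippet below is a minimal sketch using `huggingface_hub` (an assumed choice of tooling; any of the HuggingFace or ModelScope links in the table work equally well):

```python
from huggingface_hub import snapshot_download

# Minimal sketch: download one of the checkpoints from the table above.
# "XiaomiMiMo/MiMo-7B-RL" is taken from the table; swap in any other listed repo id.
local_dir = snapshot_download(repo_id="XiaomiMiMo/MiMo-7B-RL")
print(f"Weights downloaded to: {local_dir}")
```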
@@ -139,15 +137,15 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during

MiMo-7B series

- | Benchmark | MiMo-7B-Base | MiMo-7B-RL-Zero | MiMo-7B-SFT | MiMo-7B-RL |
- | ----------------------------- | :----------: | :-------------: | :---------: | :--------: |
- | **Mathematics** | | | | |
- | MATH500<br/>(Pass@1) | 37.4 | 93.6 | 93.0 | 95.8 |
- | AIME 2024<br/>(Pass@1) | 32.9 | 56.4 | 58.7 | 68.2 |
- | AIME 2025<br/>(Pass@1) | 24.3 | 46.3 | 44.3 | 55.4 |
- | **Code** | | | | |
- | LiveCodeBench v5<br/>(Pass@1) | 32.9 | 49.1 | 52.3 | 57.8 |
- | LiveCodeBench v6<br/>(Pass@1) | 29.1 | 42.9 | 45.5 | 49.3 |
+ | Benchmark | MiMo-7B-Base | MiMo-7B-RL-Zero | MiMo-7B-SFT | MiMo-7B-RL |
+ | ----------------------------- | :----------: | :-------------: | :---------: | :--------: |
+ | **Mathematics** | | | | |
+ | MATH500<br/>(Pass@1) | 37.4 | 93.6 | 93.0 | 95.8 |
+ | AIME 2024<br/>(Pass@1) | 32.9 | 56.4 | 58.7 | 68.2 |
+ | AIME 2025<br/>(Pass@1) | 24.3 | 46.3 | 44.3 | 55.4 |
+ | **Code** | | | | |
+ | LiveCodeBench v5<br/>(Pass@1) | 32.9 | 49.1 | 52.3 | 57.8 |
+ | LiveCodeBench v6<br/>(Pass@1) | 29.1 | 42.9 | 45.5 | 49.3 |

> [!IMPORTANT]
> The evaluations are conducted with `temperature=0.6`.
@@ -158,7 +156,7 @@ MiMo-7B series

### SGLang Inference

- Thanks to the [
+ Thanks to the [MiMo model support](https://github.com/sgl-project/sglang/pull/5921) and [MTP](https://github.com/sgl-project/sglang/pull/6059) contributions from the SGLang team, MiMo is now supported in mainline SGLang.

Example Script

@@ -168,9 +166,14 @@ python3 -m uv pip install "sglang[all] @ git+https://github.com/sgl-project/sgla

# Launch SGLang Server
python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-RL --host 0.0.0.0 --trust-remote-code
+
+ # Launch MTP Server
+ python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-RL --trust-remote-code \
+ --speculative-algorithm EAGLE --speculative-num-steps 1 --speculative-eagle-topk 1 \
+ --speculative-num-draft-tokens 2 --mem-fraction 0.5
```

- Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html).
+ Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html).

### vLLM inference

@@ -259,7 +262,7 @@ print(tokenizer.decode(output.tolist()[0]))
```bibtex
@misc{coreteam2025mimounlockingreasoningpotential,
title={MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining},
- author={
+ author={LLM-Core-Team Xiaomi},
year={2025},
eprint={2505.07608},
archivePrefix={arXiv},
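Once the SGLang server from the diff above is running, a request can be sent to its native `/generate` endpoint, as described in the linked SGLang documents. The snippet below is a minimal sketch, assuming the default port 30000 and the `temperature=0.6` setting mentioned in the evaluation note:

```python
import requests

# Minimal sketch: query the SGLang server launched above (default port 30000).
# temperature=0.6 follows the evaluation note; max_new_tokens is an arbitrary choice.
response = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "Prove that the sum of two even integers is even.",
        "sampling_params": {"temperature": 0.6, "max_new_tokens": 512},
    },
)
print(response.json()["text"])
```

The same server also exposes an OpenAI-compatible `/v1/chat/completions` route, so existing OpenAI client code can be pointed at it unchanged.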