bwshen-mi committed (verified)
Commit 3234005 · 1 Parent(s): 2ab7cb1

Update README.md

Files changed (1):
  1. README.md +24 -21
README.md CHANGED
@@ -11,11 +11,11 @@ library_name: transformers
 
 <h3 align="center">
 <b>
-<span>━━━━━━━━━━━━━━━━━━━━━━━━━</span>
+<span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
 <br/>
 Unlocking the Reasoning Potential of Language Model<br/>From Pretraining to Posttraining
 <br/>
-<span>━━━━━━━━━━━━━━━━━━━━━━━━━</span>
+<span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
 <br/>
 </b>
 </h3>
@@ -24,9 +24,9 @@ library_name: transformers
 
 <div align="center" style="line-height: 1;">
 |
-<a href="https://huggingface.co/collections/XiaomiMiMo/mimo-6811688ee20ba7d0682f5cb9" target="_blank">🤗 HuggingFace</a>
+<a href="https://huggingface.co/XiaomiMiMo" target="_blank">🤗 HuggingFace</a>
 &nbsp;|
-<a href="https://www.modelscope.cn/collections/MiMo-7edb0ab729c744" target="_blank">🤖️ ModelScope</a>
+<a href="https://www.modelscope.cn/organization/XiaomiMiMo" target="_blank">🤖️ ModelScope</a>
 &nbsp;|
 <a href="https://arxiv.org/abs/2505.07608" target="_blank">📔 Technical Report</a>
 &nbsp;|
@@ -39,7 +39,7 @@ library_name: transformers
 
 ## Updates
 
-[2025.05.30] During the RL training, by continuously expanding the training window size (from 32K to 48K), the performance of MiMo-7B-RL-0530 on AIME24 can be continuously improved and eventually surpass that of DeepSeek R1.
+[2025.05.30] By scaling the SFT dataset from approximately 500K to 6M instances and continuously expanding the RL training window size from 32K to 48K, the performance of [MiMo-7B-RL-0530](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-0530) on AIME24 improves continuously and eventually surpasses that of DeepSeek R1 (79.8).
 
 <table>
 <thead>
@@ -53,7 +53,7 @@ library_name: transformers
 <tr>
 <td colspan="3"><strong>Mathematics</strong></td>
 <p align="center">
-<td rowspan="11"><img width="80%" src="https://github.com/XiaomiMiMo/MiMo-test/raw/main/figures/length.jpg?raw=true"></td>
+<td rowspan="11"><img width="80%" src="https://github.com/XiaomiMiMo/MiMo/raw/main/figures/length.jpg?raw=true"></td>
 </p>
 </tr>
 <tr><td>MATH500<br/>(Pass@1)</td><td>95.8</td><td>97.2</td></tr>
@@ -108,8 +108,7 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during
 <img width="80%" src="https://github.com/XiaomiMiMo/MiMo/raw/main/figures/architecture.png?raw=true">
 </p>
 
-> Models are available at [Huggingface Collections: MiMo](https://huggingface.co/collections/XiaomiMiMo/mimo-6811688ee20ba7d0682f5cb9) and [ModelScope Collections: MiMo](https://www.modelscope.cn/collections/MiMo-7edb0ab729c744)
-
+> Models are available at [https://huggingface.co/XiaomiMiMo](https://huggingface.co/XiaomiMiMo) and [https://www.modelscope.cn/organization/XiaomiMiMo](https://www.modelscope.cn/organization/XiaomiMiMo)
 
 | **Model** | **Description** | **Download (HuggingFace)** | **Download (ModelScope)** |
 | :-------------: | :---------------------------------------------------------------------------: | :-------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------: |
@@ -117,7 +116,6 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during
 | MiMo-7B-RL-Zero | RL model trained from base model | [🤗 XiaomiMiMo/MiMo-7B-RL-Zero](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-Zero) | [🤖️ XiaomiMiMo/MiMo-7B-RL-Zero](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL-Zero) |
 | MiMo-7B-SFT | SFT model trained from base model | [🤗 XiaomiMiMo/MiMo-7B-SFT](https://huggingface.co/XiaomiMiMo/MiMo-7B-SFT) | [🤖️ XiaomiMiMo/MiMo-7B-SFT](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-SFT) |
 | MiMo-7B-RL | RL model trained from SFT model, superior performance matching OpenAI o1-mini | [🤗 XiaomiMiMo/MiMo-7B-RL](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL) | [🤖️ XiaomiMiMo/MiMo-7B-RL](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL) |
-| MiMo-7B-RL-0530 | Advanced RL model with extended length | [🤗 XiaomiMiMo/MiMo-7B-RL-0530](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-0530) | [🤖️ XiaomiMiMo/MiMo-7B-RL-0530](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL-0530) |
 
 ## III. Evaluation Results
 
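The checkpoints in the model table above load directly with `transformers`. The snippet below is a minimal sketch, not part of the README itself: it assumes the repos ship their custom MiMo model code (hence `trust_remote_code=True`) and that `accelerate` is installed so `device_map="auto"` can place the weights.

```python
# Minimal sketch: load a MiMo checkpoint listed in the table above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XiaomiMiMo/MiMo-7B-RL"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # keep the dtype stored in the checkpoint
    device_map="auto",       # requires accelerate; places weights automatically
    trust_remote_code=True,  # MiMo ships custom model code (MTP layers)
)

inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output.tolist()[0]))
```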
 
@@ -139,15 +137,15 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during
 
 MiMo-7B series
 
-| Benchmark | MiMo-7B-Base | MiMo-7B-RL-Zero | MiMo-7B-SFT | MiMo-7B-RL | MiMo-7B-RL-0530 |
-| ----------------------------- | :----------: | :-------------: | :---------: | :--------: | :-------------: |
-| **Mathematics** | | | | | |
-| MATH500<br/>(Pass@1) | 37.4 | 93.6 | 93.0 | 95.8 | 97.2 |
-| AIME 2024<br/>(Pass@1) | 32.9 | 56.4 | 58.7 | 68.2 | 80.1 |
-| AIME 2025<br/>(Pass@1) | 24.3 | 46.3 | 44.3 | 55.4 | 70.2 |
-| **Code** | | | | | |
-| LiveCodeBench v5<br/>(Pass@1) | 32.9 | 49.1 | 52.3 | 57.8 | 60.9 |
-| LiveCodeBench v6<br/>(Pass@1) | 29.1 | 42.9 | 45.5 | 49.3 | 52.2 |
+| Benchmark | MiMo-7B-Base | MiMo-7B-RL-Zero | MiMo-7B-SFT | MiMo-7B-RL |
+| ----------------------------- | :----------: | :-------------: | :---------: | :--------: |
+| **Mathematics** | | | | |
+| MATH500<br/>(Pass@1) | 37.4 | 93.6 | 93.0 | 95.8 |
+| AIME 2024<br/>(Pass@1) | 32.9 | 56.4 | 58.7 | 68.2 |
+| AIME 2025<br/>(Pass@1) | 24.3 | 46.3 | 44.3 | 55.4 |
+| **Code** | | | | |
+| LiveCodeBench v5<br/>(Pass@1) | 32.9 | 49.1 | 52.3 | 57.8 |
+| LiveCodeBench v6<br/>(Pass@1) | 29.1 | 42.9 | 45.5 | 49.3 |
 
 > [!IMPORTANT]
 > The evaluations are conducted with `temperature=0.6`.
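 
The `temperature=0.6` evaluation setting above maps directly onto a sampling configuration at inference time. Below is a minimal sketch with vLLM, assuming a vLLM build that already includes MiMo support (see the vLLM inference section of this README); the `max_tokens` value is illustrative and not taken from the report.

```python
# Minimal sketch: sampling settings matching the evaluation note above.
from vllm import LLM, SamplingParams

llm = LLM(model="XiaomiMiMo/MiMo-7B-RL", trust_remote_code=True)

# temperature=0.6 comes from the README; max_tokens is illustrative only.
sampling_params = SamplingParams(temperature=0.6, max_tokens=32768)

outputs = llm.generate(
    ["What is the sum of the first 100 positive integers?"],
    sampling_params,
)
print(outputs[0].outputs[0].text)
```
 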
@@ -158,7 +156,7 @@ MiMo-7B series
 
 ### SGLang Inference
 
-Thanks to the [contribution](https://github.com/sgl-project/sglang/pull/5921) from the SGLang team, we supported MiMo in SGLang mainstream within 24h with MTP coming soon.
+Thanks to the [MiMo model support](https://github.com/sgl-project/sglang/pull/5921) and [MTP support](https://github.com/sgl-project/sglang/pull/6059) contributed by the SGLang team, MiMo is supported in mainline SGLang.
 
 Example Script
 
@@ -168,9 +166,14 @@ python3 -m uv pip install "sglang[all] @ git+https://github.com/sgl-project/sgla
 
 # Launch SGLang Server
 python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-RL --host 0.0.0.0 --trust-remote-code
+
+# Launch MTP Server
+python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-RL --trust-remote-code \
+  --speculative-algorithm EAGLE --speculative-num-steps 1 --speculative-eagle-topk 1 \
+  --speculative-num-draft-tokens 2 --mem-fraction 0.5
 ```
 
-Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html). MTP will also be supported in 24h.
+Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html).
 
 ### vLLM inference
 
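Both SGLang launch commands in the hunk above (the second enabling MTP-based speculative decoding via the EAGLE flags) start an OpenAI-compatible HTTP server. Before moving on to the vLLM section, here is a minimal sketch of a client request, assuming SGLang's default port 30000; the linked SGLang documents remain the authoritative reference for request formats.

```python
# Minimal sketch: query the SGLang server launched above through its
# OpenAI-compatible endpoint. Assumes the default port 30000; adjust the
# base_url if you pass a different --port to sglang.launch_server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="XiaomiMiMo/MiMo-7B-RL",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    temperature=0.6,  # matches the evaluation setting noted earlier in the README
    max_tokens=4096,  # illustrative value, not from the README
)
print(response.choices[0].message.content)
```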
 
@@ -259,7 +262,7 @@ print(tokenizer.decode(output.tolist()[0]))
 ```bibtex
 @misc{coreteam2025mimounlockingreasoningpotential,
       title={MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining},
-      author={{Xiaomi LLM-Core Team}},
+      author={LLM-Core-Team Xiaomi},
       year={2025},
       eprint={2505.07608},
       archivePrefix={arXiv},