Update README.md
README.md CHANGED
@@ -11,11 +11,11 @@ library_name: transformers

<h3 align="center">
<b>
- <span
+ <span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
<br/>
Unlocking the Reasoning Potential of Language Model<br/>From Pretraining to Posttraining
<br/>
- <span
+ <span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
<br/>
</b>
</h3>
@@ -24,9 +24,9 @@ library_name: transformers

<div align="center" style="line-height: 1;">
|
- <a href="https://huggingface.co/
+ <a href="https://huggingface.co/XiaomiMiMo" target="_blank">🤗 HuggingFace</a>
|
- <a href="https://www.modelscope.cn/
+ <a href="https://www.modelscope.cn/organization/XiaomiMiMo" target="_blank">🤖️ ModelScope</a>
|
<a href="https://arxiv.org/abs/2505.07608" target="_blank">📖 Technical Report</a>
|
@@ -39,7 +39,7 @@ library_name: transformers

## Updates

- [2025.05.30]
+ [2025.05.30] We scaled the SFT dataset from approximately 500K to 6M instances and continuously expanded the RL training window size from 32K to 48K. With these changes, the performance of [MiMo-7B-RL-0530](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-0530) on AIME24 improves continuously and eventually surpasses that of DeepSeek R1 (79.8).

<table>
<thead>
@@ -53,7 +53,7 @@ library_name: transformers
<tr>
<td colspan="3"><strong>Mathematics</strong></td>
<p align="center">
- <td rowspan="11"><img width="80%" src="https://github.com/XiaomiMiMo/MiMo
+ <td rowspan="11"><img width="80%" src="https://github.com/XiaomiMiMo/MiMo/raw/main/figures/length.jpg?raw=true"></td>
</p>
</tr>
<tr><td>MATH500<br/>(Pass@1)</td><td>95.8</td><td>97.2</td></tr>
@@ -108,8 +108,7 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during
<img width="80%" src="https://github.com/XiaomiMiMo/MiMo/raw/main/figures/architecture.png?raw=true">
</p>

- > Models are available at [
-
+ > Models are available at [https://huggingface.co/XiaomiMiMo](https://huggingface.co/XiaomiMiMo) and [https://www.modelscope.cn/organization/XiaomiMiMo](https://www.modelscope.cn/organization/XiaomiMiMo)

| **Model** | **Description** | **Download (HuggingFace)** | **Download (ModelScope)** |
| :-------------: | :---------------------------------------------------------------------------: | :-------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------: |
@@ -117,7 +116,6 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during
| MiMo-7B-RL-Zero | RL model trained from base model | [🤗 XiaomiMiMo/MiMo-7B-RL-Zero](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-Zero) | [🤖️ XiaomiMiMo/MiMo-7B-RL-Zero](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL-Zero) |
| MiMo-7B-SFT | SFT model trained from base model | [🤗 XiaomiMiMo/MiMo-7B-SFT](https://huggingface.co/XiaomiMiMo/MiMo-7B-SFT) | [🤖️ XiaomiMiMo/MiMo-7B-SFT](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-SFT) |
| MiMo-7B-RL | RL model trained from SFT model, superior performance matching OpenAI o1-mini | [🤗 XiaomiMiMo/MiMo-7B-RL](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL) | [🤖️ XiaomiMiMo/MiMo-7B-RL](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL) |
- | MiMo-7B-RL-0530 | Advanced RL model with extended length | [🤗 XiaomiMiMo/MiMo-7B-RL-0530](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-0530) | [🤖️ XiaomiMiMo/MiMo-7B-RL-0530](https://www.modelscope.cn/models/XiaomiMiMo/MiMo-7B-RL-0530) |

## III. Evaluation Results

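For reference, the checkpoints listed in the model table above can also be fetched programmatically. The snippet below is a minimal sketch using `huggingface_hub` (an assumed choice of tooling; any of the HuggingFace or ModelScope links in the table work equally well):

```python
from huggingface_hub import snapshot_download

# Minimal sketch: download one of the checkpoints from the table above.
# "XiaomiMiMo/MiMo-7B-RL" is taken from the table; swap in any other listed repo id.
local_dir = snapshot_download(repo_id="XiaomiMiMo/MiMo-7B-RL")
print(f"Weights downloaded to: {local_dir}")
```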
@@ -139,15 +137,15 @@ The MTP layers of MiMo-7B is tuned during pretraining and SFT and freezed during

MiMo-7B series

- | Benchmark | MiMo-7B-Base | MiMo-7B-RL-Zero | MiMo-7B-SFT | MiMo-7B-RL |
- | ----------------------------- | :----------: | :-------------: | :---------: | :--------: |
- | **Mathematics** | | | | |
- | MATH500<br/>(Pass@1) | 37.4 | 93.6 | 93.0 | 95.8 |
- | AIME 2024<br/>(Pass@1) | 32.9 | 56.4 | 58.7 | 68.2 |
- | AIME 2025<br/>(Pass@1) | 24.3 | 46.3 | 44.3 | 55.4 |
- | **Code** | | | | |
- | LiveCodeBench v5<br/>(Pass@1) | 32.9 | 49.1 | 52.3 | 57.8 |
- | LiveCodeBench v6<br/>(Pass@1) | 29.1 | 42.9 | 45.5 | 49.3 |
+ | Benchmark | MiMo-7B-Base | MiMo-7B-RL-Zero | MiMo-7B-SFT | MiMo-7B-RL |
+ | ----------------------------- | :----------: | :-------------: | :---------: | :--------: |
+ | **Mathematics** | | | | |
+ | MATH500<br/>(Pass@1) | 37.4 | 93.6 | 93.0 | 95.8 |
+ | AIME 2024<br/>(Pass@1) | 32.9 | 56.4 | 58.7 | 68.2 |
+ | AIME 2025<br/>(Pass@1) | 24.3 | 46.3 | 44.3 | 55.4 |
+ | **Code** | | | | |
+ | LiveCodeBench v5<br/>(Pass@1) | 32.9 | 49.1 | 52.3 | 57.8 |
+ | LiveCodeBench v6<br/>(Pass@1) | 29.1 | 42.9 | 45.5 | 49.3 |

> [!IMPORTANT]
> The evaluations are conducted with `temperature=0.6`.
@@ -158,7 +156,7 @@ MiMo-7B series

### SGLang Inference

- Thanks to the [
+ Thanks to the [MiMo model support](https://github.com/sgl-project/sglang/pull/5921) and [MTP](https://github.com/sgl-project/sglang/pull/6059) contributions from the SGLang team, MiMo is now supported in mainline SGLang.

Example Script

@@ -168,9 +166,14 @@ python3 -m uv pip install "sglang[all] @ git+https://github.com/sgl-project/sgla

# Launch SGLang Server
python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-RL --host 0.0.0.0 --trust-remote-code
+
+ # Launch MTP Server
+ python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-RL --trust-remote-code \
+ --speculative-algorithm EAGLE --speculative-num-steps 1 --speculative-eagle-topk 1 \
+ --speculative-num-draft-tokens 2 --mem-fraction 0.5
```

- Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html).
+ Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html).

### vLLM inference

@@ -259,7 +262,7 @@ print(tokenizer.decode(output.tolist()[0]))
```bibtex
@misc{coreteam2025mimounlockingreasoningpotential,
title={MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining},
- author={
+ author={LLM-Core-Team Xiaomi},
year={2025},
eprint={2505.07608},
archivePrefix={arXiv},
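Once the SGLang server from the diff above is running, a request can be sent to its native `/generate` endpoint, as described in the linked SGLang documents. The snippet below is a minimal sketch, assuming the default port 30000 and the `temperature=0.6` setting mentioned in the evaluation note:

```python
import requests

# Minimal sketch: query the SGLang server launched above (default port 30000).
# temperature=0.6 follows the evaluation note; max_new_tokens is an arbitrary choice.
response = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "Prove that the sum of two even integers is even.",
        "sampling_params": {"temperature": 0.6, "max_new_tokens": 512},
    },
)
print(response.json()["text"])
```

The same server also exposes an OpenAI-compatible `/v1/chat/completions` route, so existing OpenAI client code can be pointed at it unchanged.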