Add link to paper and GitHub repo
#1 opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,26 +1,24 @@
-
-
-
 ---
+language:
+- en
 library_name: transformers
 license: other
 license_name: nvidia-open-model-license
-license_link:
-  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
+license_link: https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
 pipeline_tag: text-generation
-language:
-- en
 tags:
-
-
-
-
-
-
+- nvidia
+- reasoning
+- math
+- code
+- reinforcement learning
+- pytorch
 ---
 
 # AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
 
+This repository contains the model for AceReason-Nemotron 1.1 as presented in [AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy](https://huggingface.co/papers/2506.13284).
+
 <p align="center">
 
 [](https://arxiv.org/abs/2505.16400)
@@ -33,7 +31,7 @@ tags:
 
 ## 🔥News
 - **6/16/2025**: We are excited to share our new release combining SFT with RL: **AceReason-Nemotron-1.1-7B**
-  - Paper: https://
+  - Paper: https://huggingface.co/papers/2506.13284
   - Model: https://huggingface.co/nvidia/AceReason-Nemotron-1.1-7B
   - 4M SFT Data: https://huggingface.co/datasets/nvidia/AceReason-1.1-SFT
 - **6/11/2025**: We share our evaluation toolkit at [AceReason Evalution](https://huggingface.co/nvidia/AceReason-Nemotron-14B/blob/main/README_EVALUATION.md) including:
@@ -68,10 +66,6 @@ We evaluate our model against competitive reasoning models of comparable size wi
 | [AceReason-Nemotron-7B 🤗](https://huggingface.co/nvidia/AceReason-Nemotron-7B)| 69.0 | 53.6 | 51.8 | 44.1 |
 | [AceReason-Nemotron-14B 🤗](https://huggingface.co/nvidia/AceReason-Nemotron-14B)| 78.6 | 67.4 | 61.1 | 54.9 |
 
-
-
-
-
 ## How to use
 ```python
 import torch
@@ -104,7 +98,6 @@ generated_ids = [
 response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 ```
 
-
 ## Usage Recommendations
 
 1. Don't include a system prompt; instead, place all instructions directly in the user prompt.
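For reference, the hunks above show only fragments of the card's "How to use" snippet. Below is a minimal, self-contained sketch of the same transformers pattern; the example question and prompt wording are placeholder assumptions, while the sampling values mirror the recommendations quoted later in this diff.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nvidia/AceReason-Nemotron-1.1-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Per the usage recommendations: no system prompt, everything goes in the user turn.
messages = [{"role": "user", "content": "What is 17 * 24? Please reason step by step."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=32768, do_sample=True, temperature=0.6, top_p=0.95)

# Keep only the newly generated tokens, then decode the response.
generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```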
@@ -114,15 +107,33 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 question = "" # code question
 starter_code = "" # starter code function header
 
-code_instruction_nostartercode = """Write Python code to solve the problem. Please place the solution code in the following format
-
+code_instruction_nostartercode = """Write Python code to solve the problem. Please place the solution code in the following format:\n```python\n# Your solution code here\n```"""
+code_instruction_hasstartercode = """Please place the solution code in the following format:\n```python\n# Your solution code here\n```"""
 if starter_code != "":
-    question += "
-
+    question += "\n\n" + "Solve the problem starting with the provided function header.\n\nFunction header:\n" + "```\n" + starter_code + "\n```"
+    question += "\n\n" + code_instruction_hasstartercode
 else:
-    question += "
+    question += "\n\n" + code_instruction_nostartercode
 
-final_prompt = "<|User|>" + question + "<|Assistant|><think
+final_prompt = "<|User|>" + question + "<|Assistant|><think>\n"
 ```
 4. Our inference engine for evaluation is **vLLM==0.7.3** using top-p=0.95, temperature=0.6, max_tokens=32768.
 
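As an illustrative aside, here is a minimal sketch of running the `final_prompt` built above with vLLM under the recommended settings (top-p=0.95, temperature=0.6, max_tokens=32768); the example question and the single-GPU `LLM(...)` setup are assumptions.

```python
from vllm import LLM, SamplingParams

# Decoding settings recommended in the card.
sampling_params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=32768)
llm = LLM(model="nvidia/AceReason-Nemotron-1.1-7B")

# Build the prompt as in the snippet above (coding question without starter code).
question = "Write Python code to compute the nth Fibonacci number."  # placeholder question
code_instruction_nostartercode = """Write Python code to solve the problem. Please place the solution code in the following format:\n```python\n# Your solution code here\n```"""
question += "\n\n" + code_instruction_nostartercode
final_prompt = "<|User|>" + question + "<|Assistant|><think>\n"

outputs = llm.generate([final_prompt], sampling_params)
print(outputs[0].outputs[0].text)
```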
@@ -130,15 +141,16 @@ final_prompt = "<|User|>" + question + "<|Assistant|><think>\n"
 
 Please check evaluation code, scripts, cached prediction files in https://huggingface.co/nvidia/AceReason-Nemotron-14B/blob/main/README_EVALUATION.md
 
+## Code
+
+Our code is available at https://github.com/NVIDIA/TRT-LLM/tree/main/examples/research/ace_reason
 
 ## Correspondence to
 Yang Chen ([email protected]), Zhuolin Yang ([email protected]), Zihan Liu ([email protected]), Chankyu Lee ([email protected]), Wei Ping ([email protected])
 
-
 ## License
 Your use of this model is governed by the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
 
-
 ## Citation
 ```
 @article{chen2025acereason,
@@ -147,5 +159,4 @@ Your use of this model is governed by the [NVIDIA Open Model License](https://ww
 journal={arXiv preprint arXiv:2505.16400},
 year={2025}
 }
-```
-
+```