add link to technical report
README.md CHANGED

@@ -28,7 +28,7 @@ The model underwent a multi-phase post-training process to enhance both its reas
 
 This model is ready for commercial use.
 
-For more details on how the model was trained, please see [
+For more details on how the model was trained, please see our [technical report](https://arxiv.org/abs/2505.00949) and [blog](https://developer.nvidia.com/blog/build-enterprise-ai-agents-with-advanced-open-nvidia-llama-nemotron-reasoning-models/).
 
 
 
@@ -55,6 +55,7 @@ Developers designing AI Agent systems, chatbots, RAG systems, and other AI-power
 
 ## References
 
+* [\[2505.00949\] Llama-Nemotron: Efficient Reasoning Models](https://arxiv.org/abs/2505.00949)
 * [\[2502.00203\] Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment](https://arxiv.org/abs/2502.00203)
 * [\[2411.19146\]Puzzle: Distillation-Based NAS for Inference-Optimized LLMs](https://arxiv.org/abs/2411.19146)
 * [\[2503.18908\]FFN Fusion: Rethinking Sequential Computation in Large Language Models](https://arxiv.org/abs/2503.18908)