Update README.md
README.md

---

## Introduction

E1-Math-1.5B is a language model fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B. It is trained for [**Elastic Reasoning**](https://arxiv.org/pdf/2505.05315) with a budget-constrained rollout strategy integrated into GRPO, which teaches the model to reason adaptively when its thinking process is cut short and to generalize effectively to unseen budget constraints without additional training.
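
Because the model is trained to produce a complete solution even when its thinking phase is truncated, inference can enforce the thinking and solution budgets explicitly. The snippet below is a minimal sketch of that two-stage decoding with `transformers`; the repository id, the `</think>` delimiter, and the example budgets (1024 thinking / 512 solution tokens) are assumptions for illustration, not fixed by this card.

```python
# Two-stage, budget-constrained decoding sketch. Assumptions: the checkpoint name,
# the <think>...</think> reasoning delimiter, and the token budgets below.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Salesforce/E1-Math-1.5B"  # assumed repo id; replace with the actual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

question = "What is the sum of the first 100 positive integers?"
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    tokenize=False,
    add_generation_prompt=True,
)

thinking_budget, solution_budget = 1024, 512  # example split of the total token budget

# Stage 1: generate the thinking phase, hard-capped at the thinking budget.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
thought_ids = model.generate(**inputs, max_new_tokens=thinking_budget, do_sample=False)
text = tokenizer.decode(thought_ids[0], skip_special_tokens=False)

# Stage 2: if the budget ran out before the reasoning closed, force-close it so the
# model switches to writing the final solution within its own budget.
if "</think>" not in text:
    text += "\n</think>\n"
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=solution_budget, do_sample=False)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

The paper's budget-constrained rollout similarly splits a total budget into separate thinking and solution budgets, which is why the solution stage above gets its own cap.
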
## Performance (Avg@16)