Update README.md
README.md
CHANGED
@@ -8,7 +8,7 @@ license: cc-by-nc-4.0
 ---
 
 ## Introduction
-E1-Code-14B is a language model fine-tuned from DeepSeek-R1-Distilled-Qwen-14B. It is trained for Elastic Reasoning by budget-constrained rollout strategy, integrated into GRPO, which teaches the model to reason adaptively when the thinking process is cut short and generalizes effectively to unseen budget constraints without additional training.
+E1-Code-14B is a language model fine-tuned from DeepSeek-R1-Distilled-Qwen-14B. It is trained for [**Elastic Reasoning**](https://arxiv.org/pdf/2505.05315) by budget-constrained rollout strategy, integrated into GRPO, which teaches the model to reason adaptively when the thinking process is cut short and generalizes effectively to unseen budget constraints without additional training.
 
 ## Usage
 For detailed usage, please refer to [repo](https://github.com/SalesforceAIResearch/Elastic-Reasoning).