Update README.md
Browse files
README.md
CHANGED
@@ -10,7 +10,7 @@ license: cc-by-nc-4.0
|
|
10 |
## Introduction
|
11 |
E1-Math-1.5B is a language model fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B. It is trained for Elastic Reasoning by budget-constrained rollout strategy, integrated into GRPO, which teaches the model to reason adaptively when the thinking process is cut short and generalizes effectively to unseen budget constraints without additional training.
|
12 |
|
13 |
-
## Performance
|
14 |
|
15 |
| Model | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) |
|
16 |
|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|---------------|
|
|
|
10 |
## Introduction
|
11 |
E1-Math-1.5B is a language model fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B. It is trained for Elastic Reasoning by budget-constrained rollout strategy, integrated into GRPO, which teaches the model to reason adaptively when the thinking process is cut short and generalizes effectively to unseen budget constraints without additional training.
|
12 |
|
13 |
+
## Performance (Avg@16)
|
14 |
|
15 |
| Model | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) |
|
16 |
|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|---------------|
|