Update README.md
README.md
CHANGED
@@ -7,7 +7,9 @@ base_model:
 pipeline_tag: reinforcement-learning
 tags:
 - code
-new_version: wizardII/ArcherCodeR-1.5B
+new_version: wizardII/ArcherCodeR-1.5B-DAPO
+language:
+- en
 ---
 
 
@@ -81,4 +83,4 @@ Coming soon.
 ## Acknowledgements
 
 - We build our model upon [`DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).
-- Training was carried out with a modified version of [verl](https://github.com/volcengine/verl).
+- Training was carried out with a modified version of [verl](https://github.com/volcengine/verl).