Safetensors
qwen2
hanbin commited on
Commit
9db8d95
·
verified ·
1 Parent(s): 93492c6

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +46 -1
README.md CHANGED
@@ -2,4 +2,49 @@
2
  license: apache-2.0
3
  ---
4
 
5
- ## Introduction
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2
  license: apache-2.0
3
  ---
4
 
5
+ # Eurus-2-7B-SFT
6
+
7
+ ## Links
8
+
9
+ - 📜 [Blog]()
10
+ - 🤗 [PRIME Collection](https://huggingface.co/PRIME-RL)
11
+ - 🤗 [SFT Data](https://huggingface.co/datasets/PRIME-RL/Eurus-2-SFT-Data)
12
+
13
+ ## Introduction
14
+
15
+ Eurus-2-7B-SFT is fine-tuned from [Qwen2.5-Math-7B-Base](https://huggingface.co/Qwen/Qwen2.5-Math-7B) for its great mathematical capabilities. It trains on [Eurus-2-SFT-Data](https://huggingface.co/datasets/PRIME-RL/Eurus-2-SFT-Data), which is an action-centric chain-of-thought reasoning dataset.
16
+
17
+ We apply imitation learning (supervised finetuning) as a warmup stage to teach models to learn reasoning patterns, , serving as a starter model for [Eurus-2-7B-PRIME](https://huggingface.co/PRIME-RL/Eurus-2-7B-PRIME).
18
+
19
+ ## Usage
20
+
21
+ We apply tailored prompts for coding and math task:
22
+
23
+ **Coding**
24
+
25
+ ```
26
+ \nWhen tackling complex reasoning tasks, you have access to the following actions. Use them as needed to progress through your thought process.\n\n[ASSESS]\n\n[ADVANCE]\n\n[VERIFY]\n\n[SIMPLIFY]\n\n[SYNTHESIZE]\n\n[PIVOT]\n\n[OUTPUT]\n\nYou should strictly follow the format below:\n\n[ACTION NAME]\n\n# Your action step 1\n\n# Your action step 2\n\n# Your action step 3\n\n...\n\nNext action: [NEXT ACTION NAME]\n
27
+ ```
28
+
29
+ **Coding**
30
+
31
+ ```
32
+ {question} + "\n\nWrite Python code to solve the problem. Present the code in \n```python\nYour code\n```\nat the end.
33
+ ```
34
+
35
+ **Math**
36
+
37
+ ```
38
+ {question} + "\n\nPresent the answer in LaTex format: \\boxed{Your answer}"
39
+ ```
40
+
41
+ ## Evaluation
42
+
43
+ After finetuning, the performance of our Eurus-2-7B-SFT is shown in the following figure.
44
+
45
+ ![image-20241230162026156](./figures/performance.jpg)
46
+
47
+ ## Citation
48
+
49
+ ```
50
+ ```