Update README.md
README.md
CHANGED
@@ -53,6 +53,14 @@ test("openbmb/Eurus-RM-7b")
 # Output 2: 0.7317184507846832
 ```
 
+## Evaluation
+- Eurus-RM-7B stands out as the best 7B RM overall and achieves similar or better performance than much larger baselines. Particularly, it outperforms GPT-4 in certain tasks.
+- Our training objective is beneficial in improving RM performance on hard problems and reasoning.
+- ULTRAINTERACT is compatible with other datasets like UltraFeedback and UltraSafety, and mixing these datasets can balance different RM abilities.
+- Eurus-RM-7B improves LLMs’ reasoning performance by a large margin through reranking.
+<img src="./figures/rm_exp.png" alt="stats" style="zoom: 40%;" />
+
+
 ## Citation
 ```
 @misc{yuan2024advancing,