---
license: other
license_name: skywork-community-license
license_link: >-
  https://github.com/SkyworkAI/Skywork-Reward/blob/main/misc/Skywork%20Community%20License.pdf
datasets:
  - nvidia/HelpSteer2
language:
  - en
metrics:
  - accuracy
base_model:
  - Skywork/Skywork-Reward-Gemma-2-27B-v0.2
library_name: transformers
---

Interpreting Language Model Preferences Through the Lens of Decision Trees

RewardBench Leaderboard (Jan 2025)

| Rank | Model | Base Model | Method | Overall Score | Chat | Chat Hard | Safety | Reasoning |
|------|-------|------------|--------|---------------|------|-----------|--------|-----------|
| 1 | Decision-Tree-Reward-Gemma-2-27B | Gemma-2-27B | Decision Tree | 95.3 | 96.9 | 91.4 | 93.7 | 99.1 |
| 2 | INF-QRM-Llama3.1-70B | Llama-3.1-70B | Sequence Classifier | 95.1 | 96.6 | 91.0 | 93.6 | 99.1 |
| 3 | QRM-Gemma-2-27B | Gemma-2-27B | Sequence Classifier | 94.4 | 96.6 | 90.1 | 92.7 | 98.3 |
| 4 | Skywork-Reward-Gemma-2-27B-v0.2 | Gemma-2-27B | Sequence Classifier | 94.3 | 96.1 | 89.9 | 93.0 | 98.1 |
| 5 | Decision-Tree-Reward-Llama-3.1-8B | Llama-3.1-8B | Decision Tree | 94.3 | 96.9 | 89.3 | 92.9 | 98.5 |
| 6 | Llama-3.1-Nemotron-70B-Reward | Llama-3.1-70B | Custom Classifier | 94.1 | 97.5 | 85.7 | 95.1 | 98.1 |
| 7 | Skywork-Reward-Gemma-2-27B | Gemma-2-27B | Sequence Classifier | 93.8 | 95.8 | 91.4 | 91.9 | 96.1 |
| 8 | TextEval-Llama3.1-70B | Llama-3.1-70B | Generative | 93.5 | 94.1 | 90.1 | 93.2 | 96.4 |
| 9 | MetaMetrics-RM-v1.0 | - | Custom Classifier | 93.4 | 98.3 | 86.4 | 90.8 | 98.2 |
| 10 | Skywork-Critic-Llama-3.1-70B | Llama-3.1-70B | Generative | 93.3 | 96.6 | 87.9 | 93.1 | 95.5 |
| 11 | QRM-Llama3.1-8B-v2 | Llama-3.1-8B | Sequence Classifier | 93.1 | 96.4 | 86.8 | 92.6 | 96.8 |
| 12 | Skywork-Reward-Llama-3.1-8B-v0.2 | Llama-3.1-8B | Sequence Classifier | 93.1 | 94.7 | 88.4 | 92.7 | 96.7 |
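
To read the Overall Score column, it may help to see how it relates to the four category columns. The sketch below assumes the overall score is the unweighted mean of Chat, Chat Hard, Safety, and Reasoning. This card does not state that, but the rows above are consistent with it to within about 0.1, which is what rounding of the per-category scores would explain. The `rows` dictionary and the `mean_of_categories` helper are illustrative names, not part of any released code.

```python
# Assumption: Overall Score ~= unweighted mean of the four category scores.
# Values are copied from three rows of the leaderboard table above.
rows = {
    # model name: ((Chat, Chat Hard, Safety, Reasoning), reported overall)
    "Decision-Tree-Reward-Gemma-2-27B": ((96.9, 91.4, 93.7, 99.1), 95.3),
    "Skywork-Reward-Gemma-2-27B-v0.2": ((96.1, 89.9, 93.0, 98.1), 94.3),
    "Decision-Tree-Reward-Llama-3.1-8B": ((96.9, 89.3, 92.9, 98.5), 94.3),
}


def mean_of_categories(scores):
    """Return the unweighted mean of the category scores."""
    return sum(scores) / len(scores)


for name, (scores, reported) in rows.items():
    estimate = mean_of_categories(scores)
    # Differences of up to about 0.1 are expected because the table
    # shows category scores already rounded to one decimal place.
    print(f"{name}: mean={estimate:.3f}, reported={reported}")
```

Under this assumption, the two Decision Tree models differ mainly in Chat Hard and Reasoning, while their Chat scores are identical.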

License

Note: This model is fine-tuned from a Skywork model, which is released under the following license agreement:

The community usage of Skywork model requires Skywork Community License. The Skywork model supports commercial use. If you plan to use the Skywork model or its derivatives for commercial purposes, you must abide by terms and conditions within Skywork Community License.

To-Do

  • Reward Model Usage code
  • Architecture diagram