gaotang's Collections
RM-R1
Updated 7 days ago
RM-R1: Reward Modeling as Reasoning
RM-R1: Reward Modeling as Reasoning
Paper • 2505.02387 • Published 9 days ago • 66
gaotang/RM-R1-Distill-SFT
Viewer • Updated 6 days ago • 8.75k • 96 • 1
gaotang/RM-R1-after-Distill-RLVR
Viewer • Updated 6 days ago • 64.2k • 89 • 1
gaotang/RM-R1-Entire-RLVR-Train
Viewer • Updated 6 days ago • 73k • 115 • 1
gaotang/RM-R1-Qwen2.5-Instruct-32B
Text Ranking • Updated 6 days ago • 20 • 1
gaotang/RM-R1-Qwen2.5-Instruct-14B
Text Ranking • Updated 6 days ago • 19 • 1
gaotang/RM-R1-DeepSeek-Distilled-Qwen-14B
Text Ranking • Updated 6 days ago • 71 • 1
gaotang/RM-R1-Qwen2.5-Instruct-7B
Text Ranking • Updated 6 days ago • 19 • 2
gaotang/RM-R1-DeepSeek-Distilled-Qwen-7B
Updated 8 days ago • 43
gaotang/RM-R1-DeepSeek-Distilled-Qwen-32B
Text Ranking • Updated 6 days ago • 100
gaotang/RM-R1-Reasoning-RLVR
Viewer • Updated 7 days ago • 73k • 73