Kai Chen
hellock
AI & ML interests
None yet
Recent Activity
upvoted a paper about 13 hours ago
Pre-Trained Policy Discriminators are General Reward Models
upvoted a paper 5 months ago
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
updated a model 6 months ago
internlm/internlm3-8b-instruct