arxiv:2501.03262
Jian Hu
chuyi777
AI & ML interests
Reinforcement Learning
Recent Activity
upvoted a paper 2 days ago: s1: Simple test-time scaling
liked a model 20 days ago: CohereForAI/c4ai-command-r7b-12-2024
upvoted a paper 28 days ago: REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models