- Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
  Paper • 2502.14768 • Published • 48
- S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
  Paper • 2502.12853 • Published • 29
- Diverse Inference and Verification for Advanced Reasoning
  Paper • 2502.09955 • Published • 18
- Distillation Scaling Laws
  Paper • 2502.08606 • Published • 48
shanshan wang
cooleel
Recent Activity
- updated a model 10 days ago: tensorlake/MonkeyOCR-Recognition
- published a model 10 days ago: tensorlake/MonkeyOCR-Recognition
- liked a Space 12 days ago: ling99/OCRBench-v2-leaderboard