DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 6 days ago • 226
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models Paper • 2501.11873 • Published 7 days ago • 59
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 11 days ago • 65
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 13 days ago • 268
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 20 days ago • 81
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though Paper • 2501.04682 • Published 19 days ago • 89