Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published Jan 16 • 36
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published 24 days ago • 35
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search Paper • 2502.02508 • Published 17 days ago • 21
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning Paper • 2502.03275 • Published 16 days ago • 13
ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization Paper • 2502.04306 • Published 15 days ago • 18
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation Paper • 2502.03860 • Published 16 days ago • 23