Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published 3 days ago • 51
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 4 days ago • 66
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 12 days ago • 67
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published 14 days ago • 51
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published 17 days ago • 92
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 17 days ago • 44
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 17 days ago • 128
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation Paper • 2502.13143 • Published 19 days ago • 29
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 25 days ago • 143
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data Paper • 2502.08468 • Published 26 days ago • 13
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published Feb 5 • 58
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 199
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 108
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 341
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20 • 92