Article What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models By yjernite and 5 others • 6 days ago • 23
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published 22 days ago • 123
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • Jun 3 • 221
Article Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • Jun 21 • 66
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 110
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? Paper • 2506.05287 • Published Jun 5 • 15
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models Paper • 2505.16854 • Published May 22 • 11
Optimizing Anytime Reasoning via Budget Relative Policy Optimization Paper • 2505.13438 • Published May 19 • 36
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency Paper • 2504.18589 • Published Apr 24 • 13
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published Apr 21 • 66
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy Paper • 2503.19757 • Published Mar 25 • 52
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks Paper • 2503.21696 • Published Mar 27 • 23
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 137
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search Paper • 2503.10582 • Published Mar 13 • 23
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding Paper • 2503.02951 • Published Mar 4 • 32