Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning Paper • 2504.11354 • Published Apr 15 • 5
INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning Paper • 2505.07291 • Published 22 days ago • 11
view article Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • 22 days ago • 402
view article Article LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? By danaaubakirova and 6 others • 23 days ago • 50
view article Article Cohere on Hugging Face Inference Providers 🔥 By burtenshaw and 6 others • Apr 16 • 126
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 128
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild Paper • 2503.18892 • Published Mar 24 • 30
view article Article The N Implementation Details of RLHF with PPO By vwxyzjn and 2 others • Oct 24, 2023 • 58
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Paper • 1910.02054 • Published Oct 4, 2019 • 6