Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth Paper • 2509.03867 • Published 3 days ago • 161
Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions? Paper • 2509.04292 • Published 2 days ago • 45
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published 12 days ago • 179
InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated 9 days ago • 85
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published about 1 month ago • 170
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL Paper • 2508.07976 • Published 27 days ago • 48
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 234
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 128
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful Paper • 2507.07101 • Published Jul 9 • 3
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers Paper • 2507.04404 • Published Jul 6 • 21
BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity Paper • 2507.08771 • Published Jul 11 • 9
🧠SmolLM3 Collection Smol, multilingual, long-context reasoner • 12 items • Updated Aug 5 • 72