Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models Paper • 2505.16854 • Published 2 days ago • 9
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval Paper • 2505.16967 • Published 2 days ago • 16
Let LLMs Break Free from Overthinking via Self-Braking Tuning Paper • 2505.14604 • Published 4 days ago • 19
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published 2 days ago • 23
BLEUBERI: BLEU is a surprisingly effective reward for instruction following Paper • 2505.11080 • Published 9 days ago • 3
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning Paper • 2505.15776 • Published 3 days ago • 9
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space Paper • 2505.15778 • Published 3 days ago • 10
When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning Paper • 2505.15400 • Published 4 days ago • 20
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective Paper • 2505.15045 • Published 4 days ago • 47
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents Paper • 2505.15277 • Published 4 days ago • 92
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published 8 days ago • 56
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published 4 days ago • 17
Think Only When You Need with Large Hybrid-Reasoning Models Paper • 2505.14631 • Published 4 days ago • 18
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published 4 days ago • 116