GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning Paper • 2506.16141 • Published 20 days ago • 27
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models Paper • 2506.16054 • Published 20 days ago • 59
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning Paper • 2506.01713 • Published Jun 2 • 46
Through the Valley: Path to Effective Long CoT Training for Small Language Models Paper • 2506.07712 • Published 30 days ago • 18
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation Paper • 2505.05422 • Published May 8 • 8