- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 262
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
  Paper • 2503.12605 • Published • 36
- MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
  Paper • 2506.13585 • Published • 254
- R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
  Paper • 2503.12937 • Published • 30
Avi66
AI & ML interests
ML Research, LLMs, Applications, MultiModality
Recent Activity
Updated a collection 30 days ago: Vlm
Updated a collection 30 days ago: Papers