Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published 8 days ago • 100
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents Paper • 2506.03143 • Published 12 days ago • 46
A Controllable Examination for Long-Context Language Models Paper • 2506.02921 • Published 12 days ago • 32
ARIA: Training Language Agents with Intention-Driven Reward Aggregation Paper • 2506.00539 • Published 16 days ago • 30
AI Paper of the Day Collection A collection of papers that I think are interesting, one added each day • 385 items • Updated about 14 hours ago • 48
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published 21 days ago • 101 • 2
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published 27 days ago • 45
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning Paper • 2505.08617 • Published May 13 • 41