FocusedAD: Character-centric Movie Audio Description • Paper • arXiv:2504.12157 • Published about 1 month ago
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding • Paper • arXiv:2504.10465 • Published Apr 14
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding • Paper • arXiv:2504.13180 • Published 29 days ago