OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection Paper • 2502.20361 • Published Feb 27
Structured-Noise Masked Modeling for Video, Audio and Beyond Paper • 2503.16311 • Published Mar 20
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think Paper • 2504.20708 • Published 12 days ago
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs Paper • 2504.15280 • Published 19 days ago