Seeing More, Saying More: Lightweight Language Experts are Dynamic Video Token Compressors • Paper • 2509.00969 • Published Aug 31
ActiveVLN: Towards Active Exploration via Multi-Turn RL in Vision-and-Language Navigation • Paper • 2509.12618 • Published Sep 16