[ECCV 2024] Localizing moments in videos via text queries
A Chain-of-LoRA Agent for Long Video Reasoning