MovieQA: Understanding Stories in Movies through Question-Answering • Paper 1512.02902 • Published Dec 9, 2015
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips • Paper 1906.03327 • Published Jun 7, 2019
How you feelin'? Learning Emotions and Mental States in Movie Scenes • Paper 2304.05634 • Published Apr 12, 2023
Major Entity Identification: A Generalizable Alternative to Coreference Resolution • Paper 2406.14654 • Published Jun 20, 2024
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning • Paper 2409.03025 • Published Sep 4, 2024
The Sound of Water: Inferring Physical Properties from Pouring Liquids • Paper 2411.11222 • Published Nov 18, 2024
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment • Paper 2406.10889 • Published Jun 16, 2024