- MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
  Paper • 2501.02955 • Published • 45
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
  Paper • 2501.00958 • Published • 107
- MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
  Paper • 2501.12380 • Published • 86
- VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
  Paper • 2501.09781 • Published • 29
sergicalsix
AI & ML interests
None yet
Recent Activity
upvoted a paper 3 days ago
Scaling Test-time Compute for LLM Agents
upvoted a paper 3 days ago
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
upvoted a paper 3 days ago
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Organizations
None yet