LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis • Paper • 2412.15214 • Published 6 days ago • 14
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding • Paper • 2403.15377 • Published Mar 22 • 22
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding • Paper • 2403.09626 • Published Mar 14 • 13
VideoMamba: State Space Model for Efficient Video Understanding • Paper • 2403.06977 • Published Mar 11 • 27
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering • Paper • 2312.00109 • Published Nov 30, 2023 • 9
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation • Paper • 2307.06942 • Published Jul 13, 2023 • 22
InternChat: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language • Paper • 2305.05662 • Published May 9, 2023 • 4