CogVLM2: Visual Language Models for Image and Video Understanding Paper • 2408.16500 • Published Aug 29 • 56
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated 13 days ago • 77
MotionBooth: Motion-Aware Customized Text-to-Video Generation Paper • 2406.17758 • Published Jun 25 • 18
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens Paper • 2404.03413 • Published Apr 4 • 25
Brain2Music: Reconstructing Music from Human Brain Activity Paper • 2307.11078 • Published Jul 20, 2023 • 40