LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing Paper • 2311.00571 • Published Nov 1, 2023 • 43
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation Paper • 2505.04512 • Published May 7, 2025 • 35
Morph: A Motion-free Physics Optimization Framework for Human Motion Generation Paper • 2411.14951 • Published Nov 22, 2024 • 2
M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation Paper • 2405.16273 • Published May 25, 2024 • 1
Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading Paper • 2003.03983 • Published Mar 9, 2020
Synchronous Bidirectional Learning for Multilingual Lip Reading Paper • 2005.03846 • Published May 8, 2020