VITA-QinYu Collection VITA-QinYu: Expressive Spoken Language Model for Role-Playing and Singing • 4 items • Updated about 4 hours ago • 1
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion Paper • 2603.06577 • Published 14 days ago • 48
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization Paper • 2512.24615 • Published Dec 31, 2025 • 119
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published Jan 3, 2025 • 47