Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities Paper • 2308.12966 • Published Aug 24, 2023 • 7
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models Paper • 2311.07919 • Published Nov 14, 2023 • 9
Audio Dialogues: Dialogues dataset for audio and music understanding Paper • 2404.07616 • Published Apr 11 • 15
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 13 days ago • 131