MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark Paper • 2410.19168 • Published Oct 24, 2024 • 21
view article Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 313
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 277