MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly Paper • 2505.10610 • Published 9 days ago • 52
Running • 306 • Qwen2.5 Omni 7B Demo 🏆 Generate text and speech responses from text, images, or audio input
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation Paper • 2503.13288 • Published Mar 17 • 51