VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Paper • 2505.23693 • Published 20 days ago • 56
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval Paper • 2503.04644 • Published Mar 6 • 21