VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance Paper • 2505.15952 • Published 3 days ago • 18
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published Feb 13 • 44
AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published Oct 21, 2024 • 60