Point, Detect, Count: Multi-Task Medical Image Understanding with Instruction-Tuned Vision-Language Models Paper • 2505.16647 • Published May 22, 2025 • 1
Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy Paper • 2506.09958 • Published Jun 11, 2025 • 1
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 134
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case Study Paper • 2409.17580 • Published Sep 26, 2024 • 9
SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset Paper • 2405.07354 • Published May 12, 2024 • 2