Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese • Paper • arXiv:2404.07824 • Published Apr 11, 2024
NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations • Paper • arXiv:2312.06352 • Published Dec 11, 2023
Evaluation of Large Language Models for Decision Making in Autonomous Driving • Paper • arXiv:2312.06351 • Published Dec 11, 2023