infly/INFRL-Qwen2.5-VL-72B-Preview



Update README.md

#1

by qc808082 - opened 5 days ago

base: refs/heads/main ← from: refs/pr/1


Files changed (1)

README.md +10 -0


@@ -16,6 +16,16 @@ pipeline_tag: visual-question-answering
 - As of March 25th, 2025, **INFRL-Qwen2.5-VL-72B-Preview** is the best-performing open-sourced VL model on various visual reasoning benchmarks ([MathVision](https://mathllm.github.io/mathvision/),[MathVista](https://mathvista.github.io/), [EMMA](https://emma-benchmark.github.io/#leaderboard), [MMMUPro](https://mmmu-benchmark.github.io/), [MathVerse](https://mathverse-cuhk.github.io/)).

+| Models            | MathVision (test) | MathVista (testmini) | MathVerse (testmini) |
+|-------------------|-------------------|----------------------|----------------------|
+| GPT4o (R1-1V Rep) | 30.6              | 60                   | 41.2                 |
+| Gemini-2.0-Flash  | 41.3              | 70.1                 | 50.6                 |
+| Claude 3.5 Sonnet | 33.5              | 67.7                 | 47.8                 |
+| QvQ-72B           | 35.9              | 71.4                 | 48.6                 |
+| InternVL2.5-78B   | 34.9              | 72.3                 | 51.7                 |
+| Qwen-VL-2.5-72B   | 38.1              | 74.8                 | 57.18                |
+| INFRL-VL-Preview  | 41.9              | 77.8                 | 58.84                |
 ## Evaluation
 We will release a code repository with vLLM support for VLM evaluation.