INFRL-Qwen2.5-VL-72B-Preview

Model Overview

INFRL-Qwen2.5-VL-72B-Preview improves on the visual reasoning of the Qwen2.5-VL-72B-Instruct model.
As of March 25, 2025, INFRL-Qwen2.5-VL-72B-Preview is the best-performing open-source VL model on several visual reasoning benchmarks (MathVision, MathVista, and MathVerse).

Evaluation

Models	MathVision (test)	MathVista (testmini)	MathVerse (testmini)
GPT-4o	30.6	60	41.2
Gemini-2.0-Flash	41.3	70.1	50.6
Claude 3.5 Sonnet	33.5	67.7	47.8
QvQ-72B	35.9	71.4	48.6
InternVL2.5-78B	34.9	72.3	51.7
Qwen2.5-VL-72B	38.1	74.8	57.18
INFRL-VL-Preview	41.9	77.8	58.84

We will release a code repository for VLM evaluation. It supports RL training with simple rule-based rewards while staying aligned with LLM-judge results.

Stay tuned!
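
Until the repository is released, the idea of a simple rule-based reward can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the model writes its final answer inside `\boxed{...}` (a common convention, not confirmed by this card), and the names `extract_boxed`, `normalize`, `to_float`, and `rule_based_reward` are hypothetical. It returns 1.0 when the extracted answer matches the ground truth, numerically or as a normalized string, and 0.0 otherwise. It does not show how agreement with an LLM judge would be checked.

```python
import re
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text (nested braces allowed)."""
    start = text.rfind("\\boxed{")
    if start == -1:
        return None
    i = start + len("\\boxed{")
    depth = 1
    out = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
        i += 1
    return None  # unbalanced braces: treat as no answer


def normalize(ans: str) -> str:
    """Drop surrounding $ signs, a trailing period, all whitespace, and case."""
    ans = ans.strip().strip("$").rstrip(".")
    return re.sub(r"\s+", "", ans).lower()


def to_float(ans: str) -> Optional[float]:
    """Parse a plain number such as '1,234.5'; return None if it is not one."""
    try:
        return float(ans.replace(",", ""))
    except ValueError:
        return None


def rule_based_reward(response: str, ground_truth: str, tol: float = 1e-6) -> float:
    """Hypothetical binary reward: 1.0 for a matching final answer, else 0.0."""
    pred = extract_boxed(response)
    if pred is None:
        return 0.0  # no boxed answer found
    p, g = normalize(pred), normalize(ground_truth)
    pf, gf = to_float(p), to_float(g)
    if pf is not None and gf is not None:
        # Both sides are numbers: compare within an absolute tolerance.
        return 1.0 if abs(pf - gf) <= tol else 0.0
    # Otherwise fall back to exact match on the normalized strings.
    return 1.0 if p == g else 0.0
```

For example, `rule_based_reward("So the answer is \\boxed{3.50}.", "3.5")` evaluates to 1.0, while a response with no boxed answer receives 0.0. A binary reward of this kind is cheap to compute during RL training, but it only recognizes answers that are literally or numerically identical after normalization.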

Contributors

Supervisors

Wei Chu • Yuan Qi

VL Team

Haozhe Wang • Zuming Huang

RL Team

Haozhe Wang • Chao Qu • Long Li

Thanks

Thanks to Jiaran Hao and Liuyihan Song for their support with the RL infrastructure.

Citation

If you find our model useful, please consider citing:

@misc{INFRL_VL_Preview,
    author       = {Wang, Haozhe and Huang, Zuming and Qu, Chao and Chu, Wei and Qi, Yuan},
    title        = {INFRL-Qwen2.5-VL-72B-Preview},
    year         = {2025},
    url          = {https://huggingface.co/infly/INFRL-Qwen2.5-VL-72B-Preview},
    publisher    = {Hugging Face}
}
