
## Introduction

- Paper: Paper
- GitHub: GitHub
- Page: Page
- SFT Dataset: OmniAlign-V
- DPO Dataset: OmniAlign-V-DPO
- MM-AlignBench: MM-AlignBench
- Checkpoints: LLaVANext-OA-7B, LLaVANext-OA-32B, LLaVANext-OA-32B-DPO

This is the official repository of LLaVANext-OmniAlign (OA)-32B-DPO from *OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference*.

LLaVANext-OmniAlign-32B-DPO follows the LLaVA-Next architecture and uses Qwen2.5-32B-Instruct as its language model.

By applying a DPO stage with the OmniAlign-V-DPO dataset, we further improve the alignment of the MLLM with human preferences.
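The DPO stage optimizes the standard Direct Preference Optimization objective (Rafailov et al., 2023). Below is a minimal sketch of that loss for a single preference pair, assuming the inputs are summed response log-probabilities under the policy and a frozen reference model; it is an illustration of the objective, not the training code used for this checkpoint.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a response under
    the policy (pi_*) or the frozen reference model (ref_*).
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response over the rejected one, relative to the reference.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # Negative log-sigmoid of the margin: -log(sigmoid(margin)).
    return math.log(1.0 + math.exp(-margin))
```

Minimizing this loss pushes the policy to assign relatively higher likelihood to the preferred (chosen) response than the reference model does, which is how preference data such as OmniAlign-V-DPO shifts model behavior without a separate reward model.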

## Performance

With OmniAlign-V-DPO integrated into the DPO stage, our LLaVANext-OA-32B-DPO even surpasses Qwen2VL-72B on MM-AlignBench.

| Model | Win Rate | Reward | Better+ | Better | Tie | Worse | Worse+ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Claude3.5V-Sonnet | 84.9 | +51.4 | 70 | 144 | 12 | 31 | 4 |
| GPT-4o | 81.3 | +49.0 | 81 | 124 | 12 | 31 | 4 |
| GPT-4V | 82.5 | +46.0 | 57 | 157 | 12 | 31 | 1 |
| GeminiFlash1.5-002 | 77.0 | +39.1 | 56 | 138 | 14 | 35 | 9 |
| LLaVANext-OA-32B-DPO | 74.2 | +36.9 | 49 | 138 | 20 | 40 | 5 |
| Qwen2VL-72B | 61.5 | +21.6 | 43 | 112 | 15 | 75 | 7 |
| LLaVANext-OA-32B | 62.3 | +19.4 | 31 | 126 | 19 | 62 | 14 |
| Claude-3V-Sonnet | 50 | 0 | - | - | - | - | - |
| Qwen2VL-7B | 44.4 | -5.8 | 28 | 84 | 5 | 101 | 34 |
| InternVL2-72B | 44.4 | -6.9 | 19 | 93 | 8 | 98 | 34 |
| InternVL2-8B-MPO | 40.1 | -10.9 | 26 | 75 | 10 | 100 | 41 |
| InternVL2-8B | 31.3 | -21.8 | 18 | 61 | 15 | 109 | 49 |
| LLaMA3.2-Vision-11B | 27.8 | -33.7 | 18 | 52 | 4 | 98 | 80 |
| LLaVANext-Qwen32B | 26.6 | -29.0 | 16 | 51 | 10 | 121 | 54 |
| LLaVA-OneVision-7B | 23.8 | -46.2 | 14 | 46 | 1 | 75 | 116 |
| MiniCPM-V-2.5 | 12.7 | -53.0 | 9 | 23 | 8 | 116 | 96 |
| Xcomposer2.5-7B | 7.5 | -74.0 | 5 | 14 | 3 | 63 | 167 |
| Idefics3-8B | 2.7 | -92.3 | 3 | 4 | 0 | 15 | 230 |

## How to use

Please refer to our GitHub repository for details on training and evaluation.

Model size: 33.1B parameters (BF16, Safetensors)
