
Introduction

Paper: Paper

Github: Github

Page: Page

SFT Dataset: OmniAlign-V

DPO Dataset: OmniAlign-V-DPO

MM-AlignBench: MM-AlignBench

Checkpoints: LLaVANext-OA-7B, LLaVANext-OA-32B, LLaVANext-OA-32B-DPO

This is the official repo of LLaVANext-OmniAlign (OA)-7B, presented in OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference.

LLaVANext-OmniAlign-7B follows the LLaVA-Next architecture and uses InternLM2.5-7B-chat as its LLM backbone.

By combining the LLaVA-Next-SFT-778k multimodal data with the OmniAlign-V datasets, we significantly improve the alignment of MLLMs with human preference while also enhancing their performance on common downstream tasks, especially MMVet and MMMU.
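All of the artifacts linked above live on the Hugging Face Hub, so they can be fetched programmatically. Below is a minimal sketch using huggingface_hub; the dataset repo ids are inferred from the link names above and should be treated as assumptions.

```python
# Hedged sketch: fetch the datasets and checkpoint from the Hub.
# The dataset repo ids are inferred from the link names above (assumptions);
# snapshot_download works for any Hub repo regardless of on-disk format.
from huggingface_hub import snapshot_download

sft_dir = snapshot_download("PhoenixZ/OmniAlign-V", repo_type="dataset")      # SFT data
dpo_dir = snapshot_download("PhoenixZ/OmniAlign-V-DPO", repo_type="dataset")  # DPO data
ckpt_dir = snapshot_download("PhoenixZ/LLaVANext-OmniAlign-7B")               # this model

print(sft_dir, dpo_dir, ckpt_dir)
```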

Performance

Integrating OmniAlign-V at the Supervised Fine-Tuning (SFT) stage not only yields markedly better alignment with human preference but also improves results on standard downstream benchmarks, most notably MMVet and MMMU, as the table below shows.

| Model | Data | LLM | MM-AlignBench | WildVision | MIA-Bench | MMVet | MMMU | MMBenchV1.1 | AI2D | OCRBench |
|---|---|---|---|---|---|---|---|---|---|---|
| LLaVA | LLaVANext-778k | InternLM2.5-7B | 3.6 / -82.1 | 18.4 / -55.1 | 75.4 | 41.2 | 42.6 | 73.6 | 74.1 | 39.7 |
| LLaVA | OmniAlign-V_mix | InternLM2.5-7B | 50.0 / +3.8 | 28.2 / -34.6 | 85.4 | 43.5 | 43.3 | 73.7 | 74.7 | 41.3 |
| Δ | | | +46.4 / +85.9 | +9.8 / +20.5 | +10.0 | +2.3 | +0.7 | +0.1 | +0.6 | +1.6 |
| LLaVANext | LLaVANext-778k | InternLM2.5-7B | 20.6 / -42.7 | 23.4 / -45.0 | 76.9 | 41.8 | 44.1 | 75.1 | 74.7 | 56.2 |
| LLaVANext | OmniAlign-V_mix | InternLM2.5-7B | 57.1 / +11.1 | 29.6 / -31.3 | 86.7 | 47.7 | 46.8 | 74.9 | 77.5 | 58.9 |
| Δ | | | +36.5 / +53.8 | +6.2 / +13.7 | +9.8 | +5.9 | +2.7 | -0.2 | +2.8 | +2.7 |
| LLaVANext | LLaVANext-778k | Qwen2.5-32B | 26.6 / -29.0 | 25.2 / -41.3 | 86.0 | 47.7 | 55.2 | 79.3 | 79.6 | 55.9 |
| LLaVANext | OmniAlign-V_mix | Qwen2.5-32B | 62.3 / +19.4 | 40.2 / -14.9 | 89.6 | 56.9 | 60.7 | 80.6 | 81.7 | 55.9 |
| Δ | | | +35.7 / +48.4 | +15.0 / +26.4 | +3.6 | +9.2 | +5.5 | +1.3 | +2.1 | +0.0 |

For MM-AlignBench and WildVision, entries of the form A / B denote Winning Rate / Reward. The Δ rows give the gain from replacing LLaVANext-778k with OmniAlign-V_mix.
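For intuition on how an A / B pair arises, the sketch below aggregates per-sample judge verdicts into a winning rate and a reward using a WildVision-style scoring scheme (+100 / +50 / 0 / -50 / -100). The verdict levels and weights are assumptions based on the WildVision convention and are not confirmed by this card; see the paper for the exact protocol.

```python
# Hedged sketch: WildVision-style aggregation of judge verdicts into
# Winning Rate / Reward. The verdict levels and weights below are an
# assumption; consult the paper/Github for the authoritative protocol.
from collections import Counter

# Assumed mapping from a judge verdict to a per-sample reward.
REWARD = {
    "much_better": 100,   # model answer clearly preferred
    "better": 50,
    "tie": 0,
    "worse": -50,
    "much_worse": -100,   # reference answer clearly preferred
}

def aggregate(verdicts: list[str]) -> tuple[float, float]:
    """Return (winning_rate, reward) over a list of judge verdicts."""
    counts = Counter(verdicts)
    n = len(verdicts)
    wins = counts["much_better"] + counts["better"]
    winning_rate = 100.0 * wins / n
    reward = sum(REWARD[v] for v in verdicts) / n
    return winning_rate, reward

# Example: 3 wins out of 5 samples.
print(aggregate(["much_better", "better", "tie", "worse", "better"]))
# -> (60.0, 30.0)
```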

How to use

Please refer to our Github for more details about training and evaluation.
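As a quick local smoke test, the sketch below loads the checkpoint with the generic transformers LLaVA-NeXT classes. Because this repo carries no library tag, compatibility with these classes is an assumption, not a guarantee; the training and evaluation code in the Github repo is the authoritative entry point.

```python
# Hedged sketch: assumes this checkpoint is loadable via the generic
# transformers LLaVA-NeXT classes, which is NOT guaranteed (the repo has
# no library tag). Prefer the official Github code for real use.
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "PhoenixZ/LLaVANext-OmniAlign-7B"

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 tensors
    device_map="auto",
)

image = Image.open("example.jpg")  # any local test image
conversation = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in detail."},
    ]},
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```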

Model size: 8.44B params · Safetensors · BF16