
Introduction

- Paper: Paper
- Github: Github
- Page: Page
- SFT Dataset: OmniAlign-V
- DPO Dataset: OmniAlign-V-DPO
- MM-AlignBench: VLMEvalKit, Huggingface
- Checkpoints: LLaVANext-OA-7B, LLaVANext-OA-32B, LLaVANext-OA-32B-DPO

This is the official repo of LLaVANext-OmniAlign (OA)-7B, introduced in OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference.

LLaVANext-OmniAlign-7B follows the LLaVA-Next architecture and uses InternLM2.5-7B-chat as its language model.

Combining the LLaVA-Next-SFT-738k multimodal data with the OmniAlign-V dataset significantly improves the alignment of MLLMs with human preference and also boosts performance on common downstream benchmarks, especially MMVet and MMMU.

Performance

Integrating the OmniAlign-V dataset into the supervised fine-tuning (SFT) stage not only significantly improves the alignment of MLLMs with human preference, but also enhances performance on common downstream benchmarks, especially MMVet and MMMU.

| Model | Data | LLM | MM-AlignBench | WildVision | MIA-Bench | MMVet | MMMU | MMBenchV1.1 | AI2D | OCRBench |
|---|---|---|---|---|---|---|---|---|---|---|
| LLaVA | LLaVANext-778k | InternLM2.5-7B | 3.6 / -82.1 | 18.4 / -55.1 | 75.4 | 41.2 | 42.6 | 73.6 | 74.1 | 39.7 |
| LLaVA | OmniAlign-V_mix | InternLM2.5-7B | 50.0 / +3.8 | 28.2 / -34.6 | 85.4 | 43.5 | 43.3 | 73.7 | 74.7 | 41.3 |
| Δ | | | +46.4 / +85.9 | +9.8 / +20.5 | +10.0 | +2.3 | +0.7 | +0.1 | +0.6 | +1.6 |
| LLaVANext | LLaVANext-778k | InternLM2.5-7B | 20.6 / -42.7 | 23.4 / -45.0 | 76.9 | 41.8 | 44.1 | 75.1 | 74.7 | 56.2 |
| LLaVANext | OmniAlign-V_mix | InternLM2.5-7B | 57.1 / +11.1 | 29.6 / -31.3 | 86.7 | 47.7 | 46.8 | 74.9 | 77.5 | 58.9 |
| Δ | | | +36.5 / +53.8 | +6.2 / +13.7 | +9.8 | +5.9 | +2.7 | -0.2 | +2.8 | +2.7 |
| LLaVANext | LLaVANext-778k | Qwen2.5-32B | 26.6 / -29.0 | 25.2 / -41.3 | 86.0 | 47.7 | 55.2 | 79.3 | 79.6 | 55.9 |
| LLaVANext | OmniAlign-V_mix | Qwen2.5-32B | 62.3 / +19.4 | 40.2 / -14.9 | 89.6 | 56.9 | 60.7 | 80.6 | 81.7 | 55.9 |
| Δ | | | +35.7 / +48.4 | +15.0 / +26.4 | +3.6 | +9.2 | +5.5 | +1.3 | +2.1 | +0.0 |

For MM-AlignBench and WildVision, results are reported as A / B, where A is the winning rate and B is the reward.

How to use

Please refer to our Github repo for details on training and evaluation.
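Before running the training or evaluation scripts from the Github repo, you typically need the checkpoint files locally. A minimal sketch using `huggingface_hub.snapshot_download` (assuming the repo id `PhoenixZ/LLaVANext-OmniAlign-7B` from this page; the local directory name is our choice, not prescribed by the repo):

```python
from huggingface_hub import snapshot_download

# Repo id of this model card on the Hugging Face Hub.
REPO_ID = "PhoenixZ/LLaVANext-OmniAlign-7B"

def fetch_checkpoint(local_dir: str = "./LLaVANext-OmniAlign-7B") -> str:
    # Downloads all files in the repo (safetensors weights, tokenizer,
    # config) into local_dir and returns the path to the snapshot.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)

if __name__ == "__main__":
    print(fetch_checkpoint())
```

The resulting directory can then be passed to the training/evaluation entry points described in the Github repo.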

Model size: 8.44B params (Safetensors, BF16)
