---
license: apache-2.0
base_model:
- lmsys/vicuna-7b-v1.5
pipeline_tag: image-text-to-text
---
# FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
Jintao Tong<sup>1</sup>, Wenwei Jin<sup>2</sup>, Pengda Qin<sup>2</sup>, Anqi Li<sup>3</sup>, Yixiong Zou<sup>1✉</sup>, Yuhong Li<sup>2✉</sup>, Yuhua Li<sup>1</sup>, Ruixuan Li<sup>1</sup>

<sup>1</sup>School of Computer Science and Technology, Huazhong University of Science and Technology
<sup>2</sup>Xiaohongshu Inc., <sup>3</sup>Institute of Information Science, Beijing Jiaotong University
[Code](https://github.com/TungChintao/FlowCut) | [Paper](https://arxiv.org/pdf/2505.19536) | [License](https://github.com/TungChintao/FlowCut/blob/main/LICENSE)
## 💡 Highlights
> **TL;DR:** To address the inefficiency caused by excessive visual tokens in LVLMs, we take a unified, bottom-up perspective based on information flow, revealing how redundancy emerges dynamically across layers. Building on this view, we introduce FlowCut, which aligns pruning decisions with the model's inherent behavior and outperforms all existing approaches.
## 🛠 Preparation
Our code is easy to use.
1. Clone the [LLaVA](https://github.com/haotian-liu/LLaVA)'s repository.
```Shell
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
```
2. Install the [LLaVA](https://github.com/haotian-liu/LLaVA)'s environment.
```Shell
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip
pip install -e .
pip install flash-attn --no-build-isolation
```
3. For regular usage, you can install the package from PyPI:
```Shell
pip install flowcut
```
For development, clone the repository and install the package in editable mode:
```Shell
git clone https://github.com/TungChintao/FlowCut
cd FlowCut
pip install -e .
```
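Either way, you can verify the installation with a quick import check; this one-liner simply confirms that the `flowcut` entry point used throughout this README is importable:
```Shell
python -c "from flowcut import flowcut; print('FlowCut is ready')"
```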
The file organization is as follows:
```
├── LLaVA-main
│   ├── flowcut
│   ├── llava
│   ├── playground
│   ├── scripts
```
## 🚀 Quick Start
```python
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model
from flowcut import flowcut
model_path = "liuhaotian/llava-v1.5-7b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path)
)
## FlowCut retains 64 visual tokens
model = flowcut(model, target_num=64)
```
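After pruning, the model is used exactly like a standard LLaVA model. The snippet below is a minimal generation sketch assembled from LLaVA's stock inference utilities (`conv_templates`, `process_images`, `tokenizer_image_token`); the image path and prompt are placeholders, not part of FlowCut itself:
```python
import torch
from PIL import Image
from llava.constants import IMAGE_TOKEN_INDEX, DEFAULT_IMAGE_TOKEN
from llava.conversation import conv_templates
from llava.mm_utils import process_images, tokenizer_image_token

## Build a vicuna_v1-style prompt containing the image placeholder token
conv = conv_templates["vicuna_v1"].copy()
conv.append_message(conv.roles[0], DEFAULT_IMAGE_TOKEN + "\nDescribe this image.")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()

## Preprocess the image and tokenize the prompt (placeholder image path)
image = Image.open("example.jpg").convert("RGB")
image_tensor = process_images([image], image_processor, model.config).to(model.device, dtype=torch.float16)
input_ids = tokenizer_image_token(prompt, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt").unsqueeze(0).to(model.device)

## Generate with the pruned model (only 64 visual tokens are kept)
with torch.inference_mode():
    output_ids = model.generate(input_ids, images=image_tensor, image_sizes=[image.size],
                                do_sample=False, max_new_tokens=128)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
```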
## 📖 Evaluation
The evaluation code follows the structure of [LLaVA](https://github.com/haotian-liu/LLaVA) or [Lmms-Eval](https://github.com/EvolvingLMMs-Lab/lmms-eval). After loading the model, simply add two lines as shown below:
```python
## Load LLaVA Model (code from llava.eval.model_vqa_loader)
tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, args.model_base, model_name)
## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)
```
Script templates (please follow the detailed instructions in [LLaVA-Evaluation](https://github.com/haotian-liu/LLaVA/blob/main/docs/Evaluation.md)):
```Shell
bash scripts/v1_5/eval/[Benchmark].sh
```
Examples:
```Shell
CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mme.sh
```
```Shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/v1_5/eval/vqav2.sh
```
## 🎯 Training
The training code follows the structure of [LLaVA](https://github.com/haotian-liu/LLaVA). After loading the model, simply add two lines as shown below:
```python
## Load LLaVA Model (code from llava.train)
## ... original model-loading code ...
## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)
## training
trainer = LLaVATrainer(model=model,
                       tokenizer=tokenizer,
                       args=training_args,
                       **data_module)
```
## 🔑 License
- This project is released under the [Apache 2.0 license](https://github.com/TungChintao/FlowCut/blob/main/LICENSE).
## 📌 Citation
- If you find this project useful in your research, please consider citing:
```bibtex
@article{tong2025flowcut,
title={FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models},
author={Tong, Jintao and Jin, Wenwei and Qin, Pengda and Li, Anqi and Zou, Yixiong and Li, Yuhong and Li, Yuhua and Li, Ruixuan},
journal={arXiv preprint arXiv:2505.19536},
year={2025}
}
```