---
license: cc-by-nc-4.0
datasets:
- jinzhuoran/OmniRewardData
base_model:
- openbmb/MiniCPM-o-2_6
---

# Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences

<p align="center">
<a href="https://huggingface.co/datasets/HongbangYuan/OmniRewardBench">🤗 Benchmark</a> |
<a href="https://hf.co/datasets/jinzhuoran/OmniRewardData">🤗 Dataset</a> |
<a href="https://hf.co/jinzhuoran/OmniRewardModel">🤗 Model</a> |
<a href="https://omnireward.github.io/">🏠 Homepage</a>
</p>

## 🧩 Overview

**OmniRewardModel** is our pretrained **discriminative reward model**, designed to handle *omni-modal* tasks (e.g., text, image, and video) and *free-form human preferences*.

It is built upon the open-source base model [MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6), with an additional **value head** appended to produce scalar reward scores.

The model supports fine-grained scoring across diverse tasks and modalities, and can be loaded directly from the Hugging Face Hub.
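
For intuition, the sketch below shows the general shape of such a value head: a linear layer that maps the backbone's final hidden state to a scalar score. This is an illustration of the design described above, not the actual OmniRewardModel implementation.

```python
import torch
import torch.nn as nn

class ValueHead(nn.Module):
    """Illustrative value head: projects the backbone's last hidden state
    to one scalar reward per example (not the actual OmniRewardModel code)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        # Summarize each sequence by its final token's hidden state,
        # then project that vector down to a single scalar score.
        return self.score(last_hidden_state[:, -1, :]).squeeze(-1)

# Example: a batch of 2 sequences, 16 tokens each, hidden size 4096.
head = ValueHead(hidden_size=4096)
rewards = head(torch.randn(2, 16, 4096))  # -> tensor of shape (2,)
```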

---

## 🛠️ Environment Setup

To reproduce the training process in our paper, please make sure to set up the environment as described below.
Our training code is built upon the [LLaMA-Factory](https://github.com/hiyouga/llama-factory) framework.

```bash
git clone https://github.com/HongbangYuan/OmniReward.git
conda create -n omnireward python=3.10
conda activate omnireward
```

We recommend using **`torch==2.2.0`** for best compatibility.

Install PyTorch (choose one based on your CUDA version):

```bash
# For CUDA 11.8:
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 \
  --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1:
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 \
  --index-url https://download.pytorch.org/whl/cu121
```
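
After installation, a quick sanity check confirms that the intended build is active:

```python
# Verify the PyTorch install before proceeding.
import torch

print(torch.__version__)          # expect 2.2.0 (+cu118 or +cu121)
print(torch.cuda.is_available())  # expect True on a CUDA-capable machine
```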

Then install the remaining dependencies:

```bash
cd OmniReward/OmniReward-Factory
pip install -r requirements.txt
```

## 📦 Data Preparation

Download all required training and evaluation datasets from [OmniRewardData](https://huggingface.co/datasets/jinzhuoran/OmniRewardData) and [OmniRewardBench](https://huggingface.co/datasets/HongbangYuan/OmniRewardBench):

```bash
cd OmniReward-Factory
bash scripts/download.sh
```
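
The training data can also be pulled programmatically with the 🤗 `datasets` library. Below is a minimal sketch; the available configurations and splits are assumptions, so check the dataset card for the actual layout:

```python
# Minimal sketch: fetch OmniRewardData from the Hub. If the repository
# defines multiple configurations, pass one explicitly, e.g.
# load_dataset("jinzhuoran/OmniRewardData", "<config_name>").
from datasets import load_dataset

data = load_dataset("jinzhuoran/OmniRewardData")
print(data)
```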

## 🏋️‍♂️ Training Omni-Reward

To reproduce the training results described in our paper, please navigate to the `OmniReward-Factory` directory and run the following scripts:

```bash
cd OmniReward-Factory
bash scripts/train.sh
bash scripts/train_t2t.sh
bash scripts/train_ti2t.sh
bash scripts/train_t2iv.sh
```

## 🚀 Loading and Evaluating Omni-Reward

You can also use our pretrained Omni-Reward directly for evaluation, without retraining.

The models are publicly available at:

👉 https://huggingface.co/jinzhuoran/OmniRewardModel

```bash
cd OmniReward-Factory
bash scripts/eval_t2t.sh
bash scripts/eval_t2t_tie.sh
bash scripts/eval_ti2t.sh
bash scripts/eval_ti2t_tie.sh
```

Key options used by these evaluation scripts:

- `--eval_dataset`: Specifies the evaluation dataset (e.g., `omni_t2t`, `omni_t2i`, `omni_t2v`, etc.).
- `--eval_tie`: Enables the *w/ Ties* evaluation setting.
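
To score samples programmatically instead of going through the evaluation scripts, the sketch below shows one plausible way to load the model from the Hub. The loading pattern follows the usual MiniCPM-o convention; the exact reward-scoring call is model-specific and not shown, so treat this as an assumption and consult the model repository for authoritative usage:

```python
# Hypothetical loading sketch -- adapted from the usual MiniCPM-o pattern,
# not a confirmed OmniRewardModel API; see the model repo for real usage.
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "jinzhuoran/OmniRewardModel",
    trust_remote_code=True,       # custom model code is hosted in the repo
    torch_dtype=torch.bfloat16,
).eval()

tokenizer = AutoTokenizer.from_pretrained(
    "jinzhuoran/OmniRewardModel", trust_remote_code=True
)

# The value head then maps the backbone's output to a scalar reward;
# the exact scoring method is model-specific.
```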