Update README.md (#1)
Commit: edaa81c4db93276ed3a81737045e74f4b6b153f7
Co-authored-by: HongbangYuan <[email protected]>

README.md (changed):

---
datasets:
- jinzhuoran/OmniRewardData
base_model:
- openbmb/MiniCPM-o-2_6
---

# Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences

<p align="center">
  <a href="https://huggingface.co/datasets/HongbangYuan/OmniRewardBench">🤗 Benchmark</a> |
  <a href="https://hf.co/datasets/jinzhuoran/OmniRewardData">🤗 Dataset</a> |
  <a href="https://hf.co/jinzhuoran/OmniRewardModel">🤗 Model</a> |
  <a href="https://omnireward.github.io/">🌐 Homepage</a>
</p>

## 🧩 Overview

**OmniRewardModel** is our pretrained **discriminative reward model**, designed to handle *omni-modal* tasks (e.g., text, image, video) and *free-form human preferences*.

It is built upon the open-source base model [MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6), with an additional **value head** appended to produce scalar reward scores.

The model supports fine-grained scoring across various tasks and modalities, and can be loaded directly from the Hugging Face Hub.
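For intuition, the value head mentioned above is essentially a linear map from a hidden-state vector to a scalar score. Below is a minimal, self-contained sketch of that idea in plain Python; the vectors and weights are made up for demonstration and are not the actual MiniCPM-o-2_6 head:

```python
# Illustrative sketch of a value head: reward = w · h + b.
# The real model applies such a head to MiniCPM-o-2_6 hidden states;
# the numbers below are toy values for demonstration only.

def value_head(hidden_state, weights, bias=0.0):
    """Map a hidden-state vector to a scalar reward score."""
    return sum(w * h for w, h in zip(weights, hidden_state)) + bias

h = [0.2, -0.5, 1.0]       # toy "hidden state"
w = [0.4, 0.1, 0.3]        # toy head weights
reward = value_head(h, w)  # ≈ 0.33
```

A higher scalar indicates a more preferred response under the given criterion.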
---

## 🛠️ Environment Setup

To reproduce the training process in our paper, please set up the environment as described below. Our training code is built upon the [llama-factory](https://github.com/hiyouga/llama-factory) framework.

```bash
git clone https://github.com/HongbangYuan/OmniReward.git
conda create -n omnireward python=3.10
conda activate omnireward
```

We recommend using **`torch==2.2.0`** for best compatibility.

Install PyTorch (choose one based on your CUDA version):

```bash
# For CUDA 11.8:
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 \
    --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1:
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 \
    --index-url https://download.pytorch.org/whl/cu121
```

Then install the remaining dependencies:

```bash
cd OmniReward/OmniReward-Factory
pip install -r requirements.txt
```

## 📦 Data Preparation

Download all required training and evaluation datasets from [OmniRewardData](https://huggingface.co/datasets/jinzhuoran/OmniRewardData) and [OmniRewardBench](https://huggingface.co/datasets/HongbangYuan/OmniRewardBench):

```bash
cd OmniReward-Factory
bash scripts/download.sh
```
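As a rough illustration, a free-form preference record pairs a prompt with a chosen and a rejected response under some criterion. The field names below are hypothetical and chosen for illustration only; consult the dataset cards for the actual schema:

```python
# Hypothetical shape of a pairwise preference record (field names are
# illustrative, NOT OmniRewardData's actual schema).
record = {
    "prompt": "Describe the image in one sentence.",
    "chosen": "A golden retriever catches a frisbee mid-air in a park.",
    "rejected": "This picture has stuff in it.",
    "criterion": "helpfulness",  # free-form preference criterion
}

# A reward model should score `chosen` above `rejected` under this criterion.
```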

## 🏋️ Training Omni-Reward

To reproduce the training results described in our paper, please navigate to the `OmniReward-Factory` directory and run the following scripts:

```bash
cd OmniReward-Factory
bash scripts/train.sh
bash scripts/train_t2t.sh
bash scripts/train_ti2t.sh
bash scripts/train_t2iv.sh
```

## 🚀 Loading and Evaluating Omni-Reward

You can also directly use our pretrained Omni-Reward for evaluation without retraining. The models are publicly available at:

👉 https://huggingface.co/jinzhuoran/OmniRewardModel

```bash
cd OmniReward-Factory
bash scripts/eval_t2t.sh
bash scripts/eval_t2t_tie.sh
bash scripts/eval_ti2t.sh
bash scripts/eval_ti2t_tie.sh
```

Key arguments:

- `--eval_dataset`: Specifies the evaluation dataset (e.g., `omni_t2t`, `omni_t2i`, `omni_t2v`).
- `--eval_tie`: Enables the w/ Ties evaluation setting.
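For intuition, a discriminative reward model decides a pairwise comparison by its scalar scores, and the w/ Ties setting additionally allows a tie outcome. A minimal sketch of such a decision rule follows; the `tie_eps` threshold is an assumption for illustration, not the repository's actual logic:

```python
def judge(reward_a, reward_b, tie_eps=0.0):
    """Map two scalar rewards to a preference label: 'A', 'B', or 'tie'.

    tie_eps is a hypothetical margin: score gaps at or below it count as a tie.
    """
    if abs(reward_a - reward_b) <= tie_eps:
        return "tie"
    return "A" if reward_a > reward_b else "B"

print(judge(0.9, 0.2))                  # -> A
print(judge(0.50, 0.52, tie_eps=0.05))  # -> tie
```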