jinzhuoran and HongbangYuan committed on
Commit 269b286 · verified · 1 Parent(s): 00162e0

Update README.md (#1)

- Update README.md (edaa81c4db93276ed3a81737045e74f4b6b153f7)

Co-authored-by: HongbangYuan <[email protected]>

Files changed (1): README.md (+100 −1)
README.md CHANGED
@@ -4,4 +4,103 @@
datasets:
- jinzhuoran/OmniRewardData
base_model:
- openbmb/MiniCPM-o-2_6
---

# Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences

<p align="center">
<a href="https://huggingface.co/datasets/HongbangYuan/OmniRewardBench"> 🤗 Benchmark</a> |
<a href="https://hf.co/datasets/jinzhuoran/OmniRewardData"> 🤗 Dataset</a> |
<a href="https://hf.co/jinzhuoran/OmniRewardModel"> 🤗 Model</a> |
<a href="https://omnireward.github.io/"> 🏠 Homepage</a>
</p>

## 🧩 Overview

**OmniRewardModel** is our pretrained **discriminative reward model**, designed to handle *omni-modal* tasks (e.g., text, image, video) and *free-form human preferences*.

It is built upon the open-source base model [MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6), with an additional **value head** appended to produce scalar reward scores.

The model supports fine-grained scoring across tasks and modalities, and can be loaded directly from the Hugging Face Hub.

---
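Conceptually, the value head is just a linear projection from the backbone's final hidden state down to one scalar. A toy sketch of that idea in plain Python (illustrative only: the real head operates on MiniCPM-o-2_6's hidden tensors, and `value_head` is not part of this repository's API):

```python
def value_head(hidden_state, weights, bias):
    """Toy value head: linearly project a final hidden state
    down to a single scalar reward score."""
    return sum(h * w for h, w in zip(hidden_state, weights)) + bias

# Higher scores indicate a response judged as better.
score = value_head([0.5, -1.0, 2.0], [0.1, 0.2, 0.3], 0.05)
print(score)  # ≈ 0.5
```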

## 🛠️ Environment Setup

To reproduce the training process in our paper, please set up the environment as described below.
Our training code is built upon the [llama-factory](https://github.com/hiyouga/llama-factory) framework.

```bash
git clone https://github.com/HongbangYuan/OmniReward.git
conda create -n omnireward python=3.10
conda activate omnireward
```

We recommend **`torch==2.2.0`** for best compatibility.

Install PyTorch (choose one based on your CUDA version):

```bash
# For CUDA 11.8:
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 \
    --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1:
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 \
    --index-url https://download.pytorch.org/whl/cu121
```

Then install the remaining dependencies:

```bash
cd OmniReward/OmniReward-Factory
pip install -r requirements.txt
```

## 📦 Data Preparation

Download all required training and evaluation datasets from [OmniRewardData](https://huggingface.co/datasets/jinzhuoran/OmniRewardData) and [OmniRewardBench](https://huggingface.co/datasets/HongbangYuan/OmniRewardBench):

```bash
cd OmniReward-Factory
bash scripts/download.sh
```

## 🏋️‍♀️ Training Omni-Reward

To reproduce the training results described in our paper, navigate to the `OmniReward-Factory` directory and run the following scripts:

```bash
cd OmniReward-Factory
bash scripts/train.sh
bash scripts/train_t2t.sh
bash scripts/train_ti2t.sh
bash scripts/train_t2iv.sh
```
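The scripts above fine-tune the model on preference pairs. As background, discriminative reward models of this kind are typically trained with a pairwise Bradley–Terry objective, minimizing `-log sigmoid(r_chosen - r_rejected)`; a self-contained sketch of that generic formulation (not necessarily the exact loss implemented in these scripts):

```python
import math

def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log(sigmoid(r_chosen - r_rejected)),
    written equivalently as log(1 + exp(-margin))."""
    margin = reward_chosen - reward_rejected
    return math.log(1.0 + math.exp(-margin))

print(pairwise_loss(2.0, 0.0))  # small: the preferred response already scores higher
print(pairwise_loss(0.0, 2.0))  # large: the preference is violated
```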

## 📈 Loading and Evaluating Omni-Reward

You can also use our pretrained Omni-Reward directly for evaluation, without retraining.

The models are publicly available at:

👉 https://huggingface.co/jinzhuoran/OmniRewardModel

```bash
cd OmniReward-Factory
bash scripts/eval_t2t.sh
bash scripts/eval_t2t_tie.sh
bash scripts/eval_ti2t.sh
bash scripts/eval_ti2t_tie.sh
```

Key options for the evaluation scripts:

- `--eval_dataset`: Specifies the evaluation dataset (e.g., `omni_t2t`, `omni_t2i`, `omni_t2v`, etc.).
- `--eval_tie`: Enables the "w/ Ties" evaluation setting.
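As a rough illustration of what the "w/ Ties" setting measures (this is not the repository's implementation; the labels and helper name below are made up), evaluation reduces to comparing predicted preferences against annotations, where each prediction is `"A"`, `"B"`, or `"tie"`:

```python
def preference_accuracy(predictions, labels):
    """Fraction of examples where the predicted preference ("A", "B",
    or "tie") matches the annotated label."""
    assert len(predictions) == len(labels)
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)

preds  = ["A", "B", "tie", "A"]
labels = ["A", "B", "A",   "tie"]
print(preference_accuracy(preds, labels))  # 0.5
```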