---
license: mit
base_model:
- stabilityai/sdxl-turbo
---

# sdxl-turbo-unified-reward-dpo

## Model Summary

This model is SDXL-Turbo fine-tuned with Direct Preference Optimization (DPO) on preference data constructed by our [UnifiedReward-7B](https://huggingface.co/CodeGoat24/UnifiedReward-7b), with the goal of improving image generation quality.

For further details, please refer to the following resources:
- 📰 Paper: https://arxiv.org/pdf/2503.05236
- 🪐 Project Page: https://codegoat24.github.io/UnifiedReward/
- 🤗 Model Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-models-67c3008148c3a380d15ac63a
- 🤗 Dataset Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-training-data-67c300d4fd5eff00fa7f1ede
- 👋 Point of Contact: [Yibin Wang](https://codegoat24.github.io)


### Quick Start

SDXL-Turbo does not use `guidance_scale` or `negative_prompt`, so we disable guidance with `guidance_scale=0.0`.
The model works best at 512x512, but larger image sizes work as well.
A **single step** is enough to generate high-quality images.

```py
import torch
from diffusers import AutoPipelineForText2Image

# Load the fp16 weights and move the pipeline to the GPU.
pipe = AutoPipelineForText2Image.from_pretrained(
    "CodeGoat24/sdxl-turbo-unified-reward-dpo",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

prompt = "A cinematic shot of a baby racoon wearing an intricate italian priest robe."

# One denoising step with guidance disabled (guidance_scale=0.0).
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
```
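
If you call the pipeline from several places, it can help to collect the recommended settings in one spot. The following is a minimal sketch, not part of this repository: `turbo_call_kwargs` is a hypothetical helper name, and the divisible-by-8 check mirrors the input validation we understand the diffusers SDXL pipeline to perform, so treat it as an assumption.

```python
def turbo_call_kwargs(prompt, height=512, width=512, num_inference_steps=1):
    """Assemble keyword arguments for a pipeline call (hypothetical helper).

    Defaults follow the Quick Start: 512x512, a single step, guidance disabled.
    """
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")
    for name, value in (("height", height), ("width", width)):
        # Assumption: the SDXL pipeline expects sizes divisible by 8.
        if value % 8 != 0:
            raise ValueError(f"{name} must be divisible by 8, got {value}")
    return {
        "prompt": prompt,
        "height": height,
        "width": width,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": 0.0,  # guidance is always disabled for SDXL-Turbo
    }


kwargs = turbo_call_kwargs("A cinematic shot of a baby racoon.")
# With the `pipe` object from the Quick Start loaded:
# image = pipe(**kwargs).images[0]
# image.save("output.png")
```

Keeping `guidance_scale` fixed inside the helper means a caller cannot enable guidance by accident; pass a larger `height` and `width` if you want to try bigger images.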

## Citation

```
@article{UnifiedReward,
  title={Unified Reward Model for Multimodal Understanding and Generation},
  author={Wang, Yibin and Zang, Yuhang and Li, Hao and Jin, Cheng and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2503.05236},
  year={2025}
}
```