---
license: mit
base_model:
- stabilityai/sdxl-turbo
---
# sdxl-turbo-unified-reward-dpo

## Model Summary
This model is trained from SDXL-Turbo with DPO, using preference data constructed by our UnifiedReward-7B, for enhanced image generation quality.
For further details, please refer to the following resources:
- 📰 Paper: https://arxiv.org/pdf/2503.05236
- 🪐 Project Page: https://codegoat24.github.io/UnifiedReward/
- 🤗 Model Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-models-67c3008148c3a380d15ac63a
- 🤗 Dataset Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-training-data-67c300d4fd5eff00fa7f1ede
- 📧 Point of Contact: Yibin Wang
## Quick Start
SDXL-Turbo does not make use of `guidance_scale` or `negative_prompt`, so we disable guidance with `guidance_scale=0.0`. The model works best at a resolution of 512x512, although larger sizes also work. A single inference step is enough to generate high-quality images.
```python
from diffusers import AutoPipelineForText2Image
import torch

# Load the DPO-tuned pipeline in half precision.
pipe = AutoPipelineForText2Image.from_pretrained(
    "CodeGoat24/sdxl-turbo-unified-reward-dpo", torch_dtype=torch.float16, variant="fp16"
)
pipe.to("cuda")

prompt = "A cinematic shot of a baby racoon wearing an intricate italian priest robe."
# One step, no guidance: SDXL-Turbo is distilled for single-step sampling.
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
```
## Citation
```bibtex
@article{UnifiedReward,
  title={Unified Reward Model for Multimodal Understanding and Generation},
  author={Wang, Yibin and Zang, Yuhang and Li, Hao and Jin, Cheng and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2503.05236},
  year={2025}
}
```