arXiv:2505.07447

Unified Continuous Generative Models

Published on May 12 · Submitted by kenshinn on May 13

Abstract

Recent advances in continuous generative models, including multi-step approaches like diffusion and flow-matching (typically requiring 8-1000 sampling steps) and few-step methods such as consistency models (typically 1-8 steps), have demonstrated impressive generative performance. However, existing work often treats these approaches as distinct paradigms, resulting in separate training and sampling methodologies. We introduce a unified framework for training, sampling, and analyzing these models. Our implementation, the Unified Continuous Generative Models Trainer and Sampler (UCGM-{T,S}), achieves state-of-the-art (SOTA) performance. For example, on ImageNet 256x256 using a 675M diffusion transformer, UCGM-T trains a multi-step model achieving 1.30 FID in 20 steps and a few-step model reaching 1.42 FID in just 2 steps. Additionally, applying UCGM-S to a pre-trained model (previously 1.26 FID at 250 steps) improves performance to 1.06 FID in only 40 steps. Code is available at: https://github.com/LINs-lab/UCGM.
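To make the multi-step vs. few-step contrast above concrete, here is a minimal, self-contained sketch of a sampler whose single loop covers both regimes. To be clear, this is not the UCGM-S algorithm from the paper: the predict-x0 network interface, the alpha/sigma schedules, and the DDIM-style deterministic update are all illustrative assumptions. With a large num_steps the loop behaves like a conventional multi-step ODE-style sampler; with num_steps of 1 or 2 it collapses to consistency-style few-step sampling, because the network estimates the clean sample directly at every step.

import torch

def unified_sample(model, shape, num_steps, alpha, sigma, device="cpu"):
    """Deterministic sampler under the linear parameterization
    x_t = alpha(t) * x0 + sigma(t) * eps, with alpha(0) = sigma(1) = 1
    and alpha(1) = sigma(0) = 0. The model is assumed to predict x0
    from (x_t, t). Large num_steps: multi-step ODE-style sampling.
    num_steps = 1 or 2: few-step, consistency-style sampling."""
    x = torch.randn(shape, device=device)                 # pure noise at t = 1
    ts = torch.linspace(1.0, 0.0, num_steps + 1, device=device)
    with torch.no_grad():
        for i in range(num_steps):
            t, t_next = ts[i], ts[i + 1]
            x0_hat = model(x, t.expand(shape[0]))         # predict the clean sample
            if t_next > 0:
                # Re-derive the implied noise and re-interpolate to the
                # next (lower) noise level (a DDIM-style deterministic update).
                eps_hat = (x - alpha(t) * x0_hat) / sigma(t)
                x = alpha(t_next) * x0_hat + sigma(t_next) * eps_hat
            else:
                x = x0_hat                                # final step: return the estimate
    return x

# Toy usage with a dummy network (always predicts zeros), just to show the call shape.
dummy = lambda x, t: torch.zeros_like(x)
samples = unified_sample(dummy, (4, 3, 32, 32), num_steps=2,
                         alpha=lambda t: 1.0 - t, sigma=lambda t: t)

The schedules alpha(t) = 1 - t and sigma(t) = t correspond to the linear (rectified-flow) interpolation; other diffusion or flow-matching parameterizations only change these two functions, which is what makes a single sampling loop of this form possible.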

Community

Paper author

We have introduced a unified framework (UCGM) for training, sampling, and analyzing both multi-step models, such as diffusion and flow-matching, and few-step methods, such as consistency models.
Notably, we achieve state-of-the-art (SOTA) performance on ImageNet 256x256 (1.06 FID with 40 sampling steps, 1.42 FID with 2 sampling steps) and ImageNet 512x512 (1.24 FID with 150 sampling steps, 1.75 FID with 2 sampling steps)!

Code: https://github.com/LINs-lab/UCGM





Models citing this paper: 0

No models link this paper yet. Cite arxiv.org/abs/2505.07447 in a model README.md to link it from this page.

Datasets citing this paper: 0

No datasets link this paper yet. Cite arxiv.org/abs/2505.07447 in a dataset README.md to link it from this page.

Spaces citing this paper: 0

No Spaces link this paper yet. Cite arxiv.org/abs/2505.07447 in a Space README.md to link it from this page.

Collections including this paper: 3