Autoregressive Image Generation with Randomized Parallel Decoding
Haopeng Li1, Jinyue Yang2, Guoqi Li2,📧, Huan Wang1,📧
1 Westlake University, 2 Institute of Automation, Chinese Academy of Sciences
TL;DR
ARPG is a novel autoregressive image generation framework that performs BERT-style masked modeling with a GPT-style causal architecture (a conceptual sketch follows the feature list below).
💪 FID 1.94
🚀 Fast Speed
♻️ Low Memory Usage
🎲 Random Order
💡 Zero-shot Inference
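For intuition, the sketch below shows how a random generation order and multi-token steps can coexist in a causal decoding loop. It is a minimal, hypothetical illustration: the `model` callable, its signature, and the greedy sampling are assumptions made for brevity, not the official ARPG implementation.

```python
import torch

def randomized_parallel_decode(model, seq_len=256, steps=64, device="cpu"):
    """Hypothetical sketch: decode seq_len tokens in `steps` steps, in random order.

    `model(tokens, target_pos)` is an assumed callable that returns logits of
    shape (len(target_pos), vocab_size) for the queried positions, conditioning
    only on the tokens already decoded (entries != -1).
    """
    order = torch.randperm(seq_len, device=device)  # randomized generation order
    tokens = torch.full((seq_len,), -1, dtype=torch.long, device=device)  # -1 = undecoded
    for target_pos in order.chunk(steps):
        logits = model(tokens, target_pos)          # query the next target positions
        tokens[target_pos] = logits.argmax(dim=-1)  # greedy sampling for simplicity
    return tokens
```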
Usage
You can load ARPG through the Hugging Face DiffusionPipeline and optionally customize parameters such as the model type, number of steps, and class labels.
```python
from diffusers import DiffusionPipeline

# Load the ARPG custom pipeline from the Hugging Face Hub
pipeline = DiffusionPipeline.from_pretrained("hp-l33/ARPG", custom_pipeline="hp-l33/ARPG")

# ImageNet class labels to condition on
class_labels = [207, 360, 388, 113, 355, 980, 323, 979]

generated_image = pipeline(
    model_type="ARPG-XL",       # choose from 'ARPG-L', 'ARPG-XL', or 'ARPG-XXL'
    seed=0,                     # set a seed for reproducibility
    num_steps=64,               # number of autoregressive steps
    class_labels=class_labels,  # provide valid ImageNet class labels
    cfg_scale=4,                # classifier-free guidance scale
    output_dir="./images",      # directory to save generated images
    cfg_schedule="constant",    # choose between 'constant' (suggested) and 'linear'
    sample_schedule="arccos",   # choose between 'arccos' (suggested) and 'cosine'
)

generated_image.show()
```
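Note that `num_steps` sets the number of parallel decoding steps rather than one step per token: since ARPG decodes several tokens per step, lowering it generally speeds up sampling, typically at some cost in image quality.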
Citation
If this work is helpful for your research, please give it a star or cite it:
```bibtex
@article{li2025autoregressive,
  title={Autoregressive Image Generation with Randomized Parallel Decoding},
  author={Haopeng Li and Jinyue Yang and Guoqi Li and Huan Wang},
  journal={arXiv preprint arXiv:2503.10568},
  year={2025}
}
```
Acknowledgement
Thanks to LlamaGen for its open-source codebase. We appreciate RandAR and RAR for inspiring this work, and we also thank ControlAR.