metadata

license: mit
pipeline_tag: unconditional-image-generation
library_name: diffusers

Autoregressive Image Generation with Randomized Parallel Decoding

Haopeng Li¹, Jinyue Yang², Guoqi Li²,📧, Huan Wang¹,📧

¹ Westlake University, ² Institute of Automation, Chinese Academy of Sciences

Introduction

We introduce ARPG, a novel autoregressive image generation framework. It performs BERT-style masked modeling with a GPT-style causal architecture, so it can generate images in parallel in a random token order while still supporting the KV cache.

💪 ARPG achieves an FID of 1.94.
🚀 ARPG delivers throughput 26 times faster than LlamaGen.
♻️ ARPG reduces memory consumption by over 75% compared to VAR.
🔍 ARPG supports zero-shot inference (e.g., inpainting and outpainting).
🛠️ ARPG can be easily extended to controllable generation.
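The idea of decoding in parallel following a random token order can be pictured with a small toy sketch. It only shuffles token positions and splits them into groups, one group per step; no model is involved. The function name `random_parallel_order`, the grid of 256 tokens, and the even split across steps are illustrative assumptions, not ARPG's actual implementation.

```python
import random


def random_parallel_order(num_tokens, num_steps, seed=0):
    """Split a random permutation of token positions into per-step groups.

    Hypothetical helper for illustration only: each returned group stands
    for the positions that would be predicted together in one step.
    """
    rng = random.Random(seed)
    order = list(range(num_tokens))
    rng.shuffle(order)  # random token order

    # Spread the positions as evenly as possible over the steps.
    base, extra = divmod(num_tokens, num_steps)
    groups, start = [], 0
    for step in range(num_steps):
        size = base + (1 if step < extra else 0)
        groups.append(order[start:start + size])
        start += size
    return groups


# Assumed example sizes: 256 token positions decoded in 64 steps.
groups = random_parallel_order(num_tokens=256, num_steps=64)
print(len(groups), "steps; first step decodes positions", groups[0])
```

In this toy version every position appears in exactly one group, so after the last step the whole token grid has been filled in.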

Usage

You can load the model through the Hugging Face DiffusionPipeline and optionally customize parameters such as the model type, number of steps, and class labels.

from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("hp-l33/ARPG", custom_pipeline="hp-l33/ARPG")

class_labels = [207, 360, 388, 113, 355, 980, 323, 979]

generated_image = pipeline(
    model_type="ARPG-XL",       # choose from 'ARPG-L', 'ARPG-XL', or 'ARPG-XXL'
    seed=0,                     # set a seed for reproducibility
    num_steps=64,               # number of autoregressive steps
    class_labels=class_labels,  # provide valid ImageNet class labels
    cfg_scale=4,                # classifier-free guidance scale
    output_dir="./images",      # directory to save generated images
    cfg_schedule="constant",    # choose between 'constant' (suggested) and 'linear'
    sample_schedule="arccos",   # choose between 'arccos' (suggested) and 'cosine'
)

generated_image.show()
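To give a feel for what the num_steps and sample_schedule options control, the sketch below computes how many tokens a schedule might decode at each step. The formulas are common MaskGIT-style choices and are an assumption here; they are not taken from the ARPG code, and the helper `tokens_per_step` is hypothetical.

```python
import math


def tokens_per_step(num_tokens, num_steps, schedule="arccos"):
    """Return the number of tokens decoded at each step (illustrative only)."""

    def remaining_ratio(progress):
        # Assumed fraction of tokens still undecoded at progress in [0, 1].
        if schedule == "cosine":
            return math.cos(math.pi / 2 * progress)
        if schedule == "arccos":
            return math.acos(progress) / (math.pi / 2)
        raise ValueError(f"unknown schedule: {schedule!r}")

    counts, decoded = [], 0
    for step in range(1, num_steps + 1):
        remaining = round(num_tokens * remaining_ratio(step / num_steps))
        target = num_tokens - remaining  # tokens decoded so far after this step
        counts.append(target - decoded)
        decoded = target
    return counts


# Assumed example sizes: 256 tokens, 64 steps as in the call above.
for name in ("arccos", "cosine"):
    counts = tokens_per_step(256, 64, schedule=name)
    print(name, "first steps:", counts[:4], "last steps:", counts[-4:])
```

Because of rounding, this toy version can assign zero tokens to some steps; a real sampler would typically guarantee at least one token per step.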

Citation

If this work is helpful for your research, please give it a star or cite it:

@article{li2025autoregressive,
    title={Autoregressive Image Generation with Randomized Parallel Decoding},
    author={Haopeng Li and Jinyue Yang and Guoqi Li and Huan Wang},
    journal={arXiv preprint arXiv:2503.10568},
    year={2025}
}

Acknowledgement

We thank LlamaGen for its open-source codebase, RandAR and RAR for inspiring this work, and ControlAR.