---
license: mit
pipeline_tag: unconditional-image-generation
library_name: diffusers
---
# Autoregressive Image Generation with Randomized Parallel Decoding
[Haopeng Li](https://github.com/hp-l33)<sup>1</sup>, Jinyue Yang<sup>2</sup>, [Guoqi Li](https://casialiguoqi.github.io)<sup>2,📧</sup>, [Huan Wang](https://huanwang.tech)<sup>1,📧</sup>
<sup>1</sup> Westlake University,
<sup>2</sup> Institute of Automation, Chinese Academy of Sciences
## TL;DR
**ARPG** is a novel autoregressive image generation framework capable of performing **BERT-style masked modeling** with a **GPT-style causal architecture**.
``💪 FID 1.94`` ``🚀 Fast Speed`` ``♻️ Low Memory Usage`` ``🎲 Random Order`` ``💡 Zero-shot Inference``
## Usage
Load the model through the Hugging Face `DiffusionPipeline`, using the custom pipeline hosted in this repository. You can then customize parameters such as the model type, number of steps, and class labels.
```python
from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained("hp-l33/ARPG", custom_pipeline="hp-l33/ARPG")
class_labels = [207, 360, 388, 113, 355, 980, 323, 979]
generated_image = pipeline(
    model_type="ARPG-XL",       # choose from 'ARPG-L', 'ARPG-XL', or 'ARPG-XXL'
    seed=0,                     # set a seed for reproducibility
    num_steps=64,               # number of autoregressive steps
    class_labels=class_labels,  # provide valid ImageNet class labels
    cfg_scale=4,                # classifier-free guidance scale
    output_dir="./images",      # directory to save generated images
    cfg_schedule="constant",    # choose between 'constant' (suggested) and 'linear'
    sample_schedule="arccos",   # choose between 'arccos' (suggested) and 'cosine'
)
generated_image.show()
```
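Because the pipeline downloads weights before it runs, it can be convenient to check the arguments first. The following is a minimal sketch, not part of the ARPG repository: `build_pipeline_kwargs` is a hypothetical helper, the allowed option values are taken from the comments in the call above, and the label range 0 to 999 assumes the model is conditioned on ImageNet-1k classes.

```python
# Hypothetical helper (not part of the ARPG codebase): validates arguments
# against the options listed in the usage comments above.
MODEL_TYPES = {"ARPG-L", "ARPG-XL", "ARPG-XXL"}
CFG_SCHEDULES = {"constant", "linear"}
SAMPLE_SCHEDULES = {"arccos", "cosine"}
NUM_IMAGENET_CLASSES = 1000  # assumption: ImageNet-1k label indices 0..999


def build_pipeline_kwargs(class_labels, model_type="ARPG-XL", num_steps=64,
                          cfg_scale=4, cfg_schedule="constant",
                          sample_schedule="arccos", seed=0,
                          output_dir="./images"):
    """Validate the arguments and return them as a keyword-argument dict."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"model_type must be one of {sorted(MODEL_TYPES)}")
    if cfg_schedule not in CFG_SCHEDULES:
        raise ValueError(f"cfg_schedule must be one of {sorted(CFG_SCHEDULES)}")
    if sample_schedule not in SAMPLE_SCHEDULES:
        raise ValueError(
            f"sample_schedule must be one of {sorted(SAMPLE_SCHEDULES)}")
    if not isinstance(num_steps, int) or num_steps < 1:
        raise ValueError("num_steps must be a positive integer")
    labels = list(class_labels)
    if not labels:
        raise ValueError("class_labels must contain at least one label")
    for label in labels:
        if not isinstance(label, int) or not 0 <= label < NUM_IMAGENET_CLASSES:
            raise ValueError(f"invalid ImageNet class label: {label!r}")
    return {
        "model_type": model_type,
        "seed": seed,
        "num_steps": num_steps,
        "class_labels": labels,
        "cfg_scale": cfg_scale,
        "output_dir": output_dir,
        "cfg_schedule": cfg_schedule,
        "sample_schedule": sample_schedule,
    }


kwargs = build_pipeline_kwargs(class_labels=[207, 360, 388])
```

The returned dictionary uses the same keyword names as the call above, so it can be passed on as `pipeline(**kwargs)` once the pipeline is loaded.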
## Citation
If this work is helpful for your research, please give it a star or cite it:
```bibtex
@article{li2025autoregressive,
title={Autoregressive Image Generation with Randomized Parallel Decoding},
author={Haopeng Li and Jinyue Yang and Guoqi Li and Huan Wang},
journal={arXiv preprint arXiv:2503.10568},
year={2025}
}
```
## Acknowledgement
Thanks to [LlamaGen](https://github.com/FoundationVision/LlamaGen) for its open-source codebase. We also appreciate [RandAR](https://github.com/ziqipang/RandAR) and [RAR](https://github.com/bytedance/1d-tokenizer/blob/main/README_RAR.md) for inspiring this work, and thank [ControlAR](https://github.com/hustvl/ControlAR).