arxiv:2307.01952

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

Published on Jul 4, 2023
· Submitted by akhaliq on Jul 6, 2023
#1 Paper of the day

Abstract

We present SDXL, a latent diffusion model for text-to-image synthesis. Compared to previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone: the increase in model parameters is mainly due to more attention blocks and a larger cross-attention context, as SDXL uses a second text encoder. We design multiple novel conditioning schemes and train SDXL on multiple aspect ratios. We also introduce a refinement model which is used to improve the visual fidelity of samples generated by SDXL using a post-hoc image-to-image technique. We demonstrate that SDXL shows drastically improved performance compared to the previous versions of Stable Diffusion and achieves results competitive with those of black-box state-of-the-art image generators. In the spirit of promoting open research and fostering transparency in large model training and evaluation, we provide access to code and model weights at https://github.com/Stability-AI/generative-models
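
The "larger cross-attention context" from a second text encoder can be pictured as per-token features from two encoders joined along the channel axis. The sketch below is only an illustration of that idea, not the released implementation: the encoders are random stand-ins, and the sequence length (77) and feature widths (768 and 1280) are assumptions that are not stated in the abstract. All function and constant names are hypothetical.

```python
import random

SEQ_LEN = 77          # assumed number of token slots per prompt
DIM_ENCODER_A = 768   # assumed feature width of the first text encoder
DIM_ENCODER_B = 1280  # assumed feature width of the second text encoder


def fake_text_encoder(prompt, dim, seed):
    """Hypothetical stand-in for a text encoder.

    Returns SEQ_LEN vectors of size `dim`, seeded by the prompt so the
    output is reproducible. A real encoder would return learned features.
    """
    rng = random.Random(f"{seed}:{prompt}")
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(SEQ_LEN)]


def build_cross_attention_context(prompt):
    """Join both encoders' per-token features along the channel axis."""
    tokens_a = fake_text_encoder(prompt, DIM_ENCODER_A, seed=0)
    tokens_b = fake_text_encoder(prompt, DIM_ENCODER_B, seed=1)
    # List concatenation per token: each token's context vector has
    # DIM_ENCODER_A + DIM_ENCODER_B channels.
    return [a + b for a, b in zip(tokens_a, tokens_b)]


context = build_cross_attention_context("a photo of an astronaut")
print(len(context), len(context[0]))  # 77 2048
```

Under these assumptions, every token the UNet attends to carries 2048 channels instead of the 768 a single encoder would supply, which is one way the context grows without changing the number of tokens.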

Community

Hey, I'm reviewing deep learning papers on Twitter daily in Hebrew under the hashtag #shorthebrewpapereviews (https://twitter.com/hashtag/shorthebrewpapereviews?src=hashtag_click). So far I've briefly reviewed about deep learning papers. You are invited to follow and comment.

This paper review can be found at https://twitter.com/MikeE_3_14/status/1677747429221838848?s=20

Indian natraj with pickleball rackets in his hands

SDXL: A New Benchmark in High-Resolution Image Synthesis

Links 🔗:

👉 Subscribe: https://www.youtube.com/@Arxflix
👉 Twitter: https://x.com/arxflix
👉 LMNT (Partner): https://lmnt.com/

By Arxflix

Models citing this paper 59

Datasets citing this paper 0

Spaces citing this paper 2,448

Collections including this paper 29