OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Abstract
Image-based virtual try-on (VTON), which aims to generate an outfitted image of a target human wearing an in-shop garment, is a challenging image-synthesis task calling for not only high fidelity of the outfitted human but also full preservation of garment details. To tackle this issue, we propose Outfitting over Try-on Diffusion (OOTDiffusion), leveraging the power of pretrained latent diffusion models and designing a novel network architecture for realistic and controllable virtual try-on. Without an explicit warping process, we propose an outfitting UNet to learn the garment detail features, and merge them with the target human body via our proposed outfitting fusion in the denoising process of diffusion models. In order to further enhance the controllability of our outfitting UNet, we introduce outfitting dropout to the training process, which enables us to adjust the strength of garment features through classifier-free guidance. Our comprehensive experiments on the VITON-HD and Dress Code datasets demonstrate that OOTDiffusion efficiently generates high-quality outfitted images for arbitrary human and garment images, which outperforms other VTON methods in both fidelity and controllability, indicating an impressive breakthrough in virtual try-on. Our source code is available at https://github.com/levihsu/OOTDiffusion.
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- PFDM: Parser-Free Virtual Try-on via Diffusion Model (2024)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis (2024)
- Cross-view Masked Diffusion Transformers for Person Image Synthesis (2024)
- Product-Level Try-on: Characteristics-preserving Try-on with Realistic Clothes Shading and Wrinkles (2024)
- Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All (2024)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
OOTDiffusion: Revolutionary Virtual Try-On Using Latent Diffusion Models
Links π:
π Subscribe: https://www.youtube.com/@Arxflix
π Twitter: https://x.com/arxflix
π LMNT (Partner): https://lmnt.com/
Models citing this paper 5
Browse 5 models citing this paperDatasets citing this paper 0
No dataset linking this paper