Update README.md

README.md CHANGED
@@ -8,23 +8,25 @@ license: creativeml-openrail-m
 
 
 
-SDXL consists of a mixture-of-experts pipeline for latent diffusion: 
+[SDXL](https://arxiv.org/abs/2307.01952) consists of a mixture-of-experts pipeline for latent diffusion: 
 In a first step, the base model is used to generate (noisy) latents, 
-which are then further processed with a refinement model (available here: 
+which are then further processed with a refinement model (available here: https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/) specialized for the final denoising steps.
 Note that the base model can be used as a standalone module.
 
-Alternatively, we can use a two-
+Alternatively, we can use a two-stage pipeline as follows: 
 First, the base model is used to generate latents of the desired output size. 
 In the second step, we use a specialized high-resolution model and apply a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as "img2img") 
-to the latents generated in the first step, using the same prompt. 
+to the latents generated in the first step, using the same prompt. This technique is slightly slower than the first one, as it requires more function evaluations.
+
+Source code is available at https://github.com/Stability-AI/generative-models .
 
 ### Model Description
 
 - **Developed by:** Stability AI
 - **Model type:** Diffusion-based text-to-image generative model
-- **License:** [
+- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
 - **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses two fixed, pretrained text encoders ([OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip) and [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main)).
-- **Resources for more information:** [GitHub Repository](https://github.com/Stability-AI/generative-models) [SDXL
+- **Resources for more information:** Check out our [GitHub Repository](https://github.com/Stability-AI/generative-models) and the [SDXL report on arXiv](https://arxiv.org/abs/2307.01952).
 
 ### Model Sources
 
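The two-stage pipeline the new text describes can be sketched in code. The following is a minimal illustration, assuming the `diffusers` library and the two checkpoints linked in the README; the diff itself only names the checkpoints and the generative-models repository, so the pipeline classes, arguments, and prompt below are illustrative rather than part of this change.

```python
# Sketch of the two-stage pipeline: the base model generates latents,
# and the refiner applies SDEdit-style denoising ("img2img") to those
# latents using the same prompt. Assumes `diffusers` and a CUDA GPU.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

# Stage 1: the base model (also usable standalone).
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

# Stage 2: the refinement model specialized for the final denoising steps.
# Sharing the second text encoder and VAE with the base saves memory.
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

prompt = "A majestic lion jumping from a big stone at night"  # example prompt

# The base model returns (noisy) latents; output_type="latent" skips VAE decoding.
latents = base(prompt=prompt, output_type="latent").images

# The refiner processes those latents with the same prompt and decodes the image.
image = refiner(prompt=prompt, image=latents).images[0]
image.save("lion.png")
```

Keeping the base model's output in latent space (`output_type="latent"`) is what lets the refiner apply its img2img denoising before the VAE decode; the extra refiner pass is the source of the additional function evaluations the diff mentions.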