John M committed on
Commit
f6a2086
2 Parent(s): 71b97d3 95b7c0d

Merge branch 'main' of https://huggingface.co/hotshotco/SDXL-512 into main

  1. README.md +58 -4
README.md CHANGED
@@ -5,15 +5,69 @@ tags:
  - stable-diffusion
  ---
 
- ![image/gif](https://cdn-uploads.huggingface.co/production/uploads/637a6daf7ce76c3b83497ea2/ux_sZKB9snVPsKRT1TzfG.gif)
 
  # Model Description
  - **Developed by**: Natural Synthetics Inc.
  - **Model type**: Diffusion-based text-to-image generative model
  - **License**: CreativeML Open RAIL++-M License
- - **Model Description**: This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L).
- - **Resources for more information**: Check out our [GitHub Repository](https://github.com/hotshotco/hotshot-xl).
 
 
  # Limitations and Bias
  ## Limitations
@@ -23,4 +77,4 @@ tags:
  - Faces and people in general may not be generated properly.
 
  ## Bias
- While the capabilities of video generation models are impressive, they can also reinforce or exacerbate social biases.
 
  - stable-diffusion
  ---
 
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/637a6daf7ce76c3b83497ea2/FAHjxgN2tk6uXmQAUeFI5.jpeg)
+
+ <hr>
+
+ # Overview
+ SDXL-512 is a checkpoint fine-tuned from SDXL 1.0 that is designed to more simply generate higher-fidelity images at and around the 512x512 resolution. The model was fine-tuned using a learning rate of 1e-6 over 7000 steps with a batch size of 64 on a curated dataset of multiple aspect ratios, alternating low- and high-resolution batches (per aspect ratio) so as not to impair the base model's existing performance at higher resolution.
+
+ *Note:* It bears repeating that SDXL-512 was not trained to be "better" than SDXL, but rather to simplify prompting for higher-fidelity outputs at and around the 512x512 resolution.
+
+ - **Use it with [Hotshot-XL](https://huggingface.co/hotshotco/Hotshot-XL) (recommended)**
+
+ <hr>
 
  # Model Description
  - **Developed by**: Natural Synthetics Inc.
  - **Model type**: Diffusion-based text-to-image generative model
  - **License**: CreativeML Open RAIL++-M License
+ - **Model Description**: This is a model that can be used to generate and modify higher-fidelity images at and around the 512x512 resolution.
+ - **Resources for more information**: Check out our [GitHub Repository](https://github.com/hotshotco/Hotshot-XL).
+ - **Finetuned from model**: [Stable Diffusion XL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
+
+ <hr>
+
+ # 🧨 Diffusers
+
+ Make sure to upgrade diffusers to >= 0.18.2:
+ ```
+ pip install diffusers --upgrade
+ ```
+
+ In addition, make sure to install `transformers`, `safetensors`, and `accelerate`, as well as the invisible watermark:
+ ```
+ pip install invisible_watermark transformers accelerate safetensors
+ ```
+
+ Running the pipeline (if you don't swap the scheduler, it will run with the default **EulerDiscreteScheduler**; in this example we swap it to **EulerAncestralDiscreteScheduler**):
+ ```py
+ from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler
+
+ pipe = StableDiffusionXLPipeline.from_pretrained(
+     "hotshotco/SDXL-512",
+     use_safetensors=True,
+ ).to('cuda')
+
+ # Replace the default scheduler, reusing its existing configuration.
+ pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
+
+ prompt = "a woman laughing"
+ negative_prompt = ""
+
+ # width/height set the output size; target_size and original_size are
+ # SDXL micro-conditioning inputs, not output dimensions.
+ image = pipe(
+     prompt,
+     negative_prompt=negative_prompt,
+     width=512,
+     height=512,
+     target_size=(1024, 1024),
+     original_size=(4096, 4096),
+     num_inference_steps=50
+ ).images[0]
+
+ image.save("woman_laughing.png")
+ ```
 
+ <hr>
 
  # Limitations and Bias
  ## Limitations

  - Faces and people in general may not be generated properly.
 
  ## Bias
+ While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
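
The Overview added in this commit says training alternated low- and high-resolution batches per aspect ratio. The commit does not include any training code, so the following is only a minimal sketch of what such a schedule could look like. The bucket sizes, the `build_schedule` helper, and the low-then-high round-robin ordering are all assumptions made for illustration; only the step count (7000) and the batch size (64) come from the README.

```python
from itertools import cycle, islice

# Hypothetical (width, height) buckets; the README does not list the real ones.
LOW_RES = {"1:1": (512, 512), "3:2": (608, 416), "2:3": (416, 608)}
HIGH_RES = {"1:1": (1024, 1024), "3:2": (1216, 832), "2:3": (832, 1216)}


def build_schedule(num_steps, batch_size):
    """Yield (step, aspect_ratio, (width, height), batch_size) tuples.

    For each aspect ratio, emit a low-resolution batch followed by a
    high-resolution batch, then move on to the next ratio. This ordering
    is an assumption, not the authors' documented procedure.
    """
    def pattern():
        for ratio in LOW_RES:
            yield ratio, LOW_RES[ratio]
            yield ratio, HIGH_RES[ratio]

    # cycle() repeats the six-entry pattern; islice() stops after num_steps.
    for step, (ratio, size) in enumerate(islice(cycle(pattern()), num_steps)):
        yield step, ratio, size, batch_size


if __name__ == "__main__":
    schedule = list(build_schedule(num_steps=7000, batch_size=64))
    for entry in schedule[:4]:
        print(entry)
    print("total samples:", sum(batch for *_, batch in schedule))
```

With the README's figures, 7000 steps at a batch size of 64 amounts to 448,000 samples if every batch is full; the sketch only decides which resolution bucket each step draws from.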