Aptronym committed on
Commit 8c909c0 · verified · 1 Parent(s): e2ecfd9

Upload 16 files
Flash-README.md ADDED
@@ -0,0 +1,79 @@
+ ---
+ tags:
+ - text-to-image
+ - stable-diffusion
+ - lora
+ - diffusers
+ - template:sd-lora
+ base_model: stabilityai/stable-diffusion-xl-base-1.0
+ license: cc-by-nc-nd-4.0
+ ---
+ # ⚡ Flash Diffusion: FlashSDXL ⚡
+
+ Flash Diffusion is a diffusion distillation method proposed in [Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation](http://arxiv.org/abs/2406.02347) *by Clément Chadebec, Onur Tasar, Eyal Benaroche, and Benjamin Aubin.*
+ This model is a **108M**-parameter LoRA-distilled version of the [SDXL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) model that is able to generate images in **4 steps**. Its main purpose is to reproduce the main results of the paper.
+ See our [live demo](https://huggingface.co/spaces/jasperai/FlashPixart) and [official implementation](https://github.com/gojasper/flash-diffusion).
+
+ <p align="center">
+ <img style="width:700px;" src="images/flash_sdxl.jpg">
+ </p>
+
+ # How to use?
+
+ The model can be used directly with the `DiffusionPipeline` from the `diffusers` library, reducing the number of required sampling steps to **4**.
+
+ ```python
+ from diffusers import DiffusionPipeline, LCMScheduler
+
+ adapter_id = "jasperai/flash-sdxl"
+
+ pipe = DiffusionPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-base-1.0",
+     use_safetensors=True,
+ )
+
+ pipe.scheduler = LCMScheduler.from_pretrained(
+     "stabilityai/stable-diffusion-xl-base-1.0",
+     subfolder="scheduler",
+     timestep_spacing="trailing",
+ )
+ pipe.to("cuda")
+
+ # Load and fuse the LoRA weights
+ pipe.load_lora_weights(adapter_id)
+ pipe.fuse_lora()
+
+ prompt = "A raccoon reading a book in a lush forest."
+
+ image = pipe(prompt, num_inference_steps=4, guidance_scale=0).images[0]
+ ```
+ <p align="center">
+ <img style="width:400px;" src="images/raccoon.png">
+ </p>
+
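+ If you want to switch back to the base SDXL model in the same session, you can unfuse and unload the adapter. A minimal sketch using the standard `diffusers` LoRA utilities (not from the original card):
+ ```python
+ # Revert the pipeline to the base SDXL weights
+ pipe.unfuse_lora()
+ pipe.unload_lora_weights()
+ ```
+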
+ # Training Details
+ The model was trained for 20k iterations on 4 H100 GPUs (roughly 176 GPU hours in total). Please refer to the [paper](http://arxiv.org/abs/2406.02347) for further details on the training parameters.
+
+ **Metrics on COCO 2014 validation (Table 3)**
+ - FID-10k: 21.62 (4 NFE)
+ - CLIP Score: 0.327 (4 NFE)
+
+ ## Citation
+ If you find this work useful or use it in your research, please consider citing us:
+
+ ```bibtex
+ @misc{chadebec2024flash,
+   title={Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation},
+   author={Clement Chadebec and Onur Tasar and Eyal Benaroche and Benjamin Aubin},
+   year={2024},
+   eprint={2406.02347},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV}
+ }
+ ```
+
+ ## License
+ This model is released under the Creative Commons BY-NC-ND 4.0 license.
Hyper-SD(XL).md ADDED
@@ -0,0 +1,346 @@
+ ---
+ license: openrail++
+ library_name: diffusers
+ inference: false
+ tags:
+ - lora
+ - text-to-image
+ - stable-diffusion
+ ---
+
+ # Hyper-SD
+ Official Repository of the paper: *[Hyper-SD](https://arxiv.org/abs/2404.13686)*.
+
+ Project Page: https://hyper-sd.github.io/
+
+ ![](./hypersd_tearser.jpg)
+
+ ## News🔥🔥🔥
+
+ * May.13, 2024. 💥💥💥 The **12-Steps CFG-Preserved** [Hyper-SDXL-12steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SDXL-12steps-CFG-lora.safetensors) and [Hyper-SD15-12steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SD15-12steps-CFG-lora.safetensors) are also available now (supporting guidance scales of 5~8). These can be more practical, offering a better trade-off between performance and speed. Enjoy! 💥💥💥
+ * Apr.30, 2024. Our **8-Steps CFG-Preserved** [Hyper-SDXL-8steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SDXL-8steps-CFG-lora.safetensors) and [Hyper-SD15-8steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SD15-8steps-CFG-lora.safetensors) are available now (supporting guidance scales of 5~8); we strongly recommend making the 8-step CFG LoRA a standard configuration for all SDXL and SD1.5 models (see the usage sketch below)!
+ * Apr.28, 2024. ComfyUI workflows for the 1-Step Unified LoRA 🥰 with TCDScheduler, supporting inference at different step counts, are [released](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui)! Remember to install ⭕️ [ComfyUI-TCD](https://github.com/JettHu/ComfyUI-TCD) in your `ComfyUI/custom_nodes` folder! You're encouraged to adjust the eta parameter to get better results 🌟!
+ * Apr.26, 2024. Thanks to @[Pete](https://huggingface.co/pngwn) for contributing a larger canvas to our [scribble demo](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble) 👏.
+ * Apr.24, 2024. The ComfyUI [workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-Unet-workflow.json) and [checkpoint](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors) for the 1-Step SDXL UNet ✨ are also available! Don't forget ⭕️ to install the custom [scheduler](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui/ComfyUI-HyperSDXL1StepUnetScheduler) in your `ComfyUI/custom_nodes` folder!
+ * Apr.23, 2024. ComfyUI workflows for the N-Step LoRAs are [released](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui)! Worth a try for creators 💥!
+ * Apr.23, 2024. Our technical report 📚 is uploaded to [arXiv](https://arxiv.org/abs/2404.13686)! Many implementation details are provided, and we welcome more discussion 👏.
+ * Apr.21, 2024. Hyper-SD ⚡️ is highly compatible and works well with different base models and ControlNets. For clarity, we also provide a ControlNet usage example [here](https://huggingface.co/ByteDance/Hyper-SD#controlnet-usage).
+ * Apr.20, 2024. Our checkpoints and two demos 🤗 (i.e. [SD15-Scribble](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble) and [SDXL-T2I](https://huggingface.co/spaces/ByteDance/Hyper-SDXL-1Step-T2I)) are publicly available on the [HuggingFace Repo](https://huggingface.co/ByteDance/Hyper-SD).
+
+ ## Try our Hugging Face demos:
+ Hyper-SD Scribble demo hosted on [🤗 scribble](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble)
+
+ Hyper-SDXL One-step Text-to-Image demo hosted on [🤗 T2I](https://huggingface.co/spaces/ByteDance/Hyper-SDXL-1Step-T2I)
+
+ ## Introduction
+
+ Hyper-SD is a new state-of-the-art diffusion model acceleration technique.
+ In this repository, we release the models distilled from [SDXL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [Stable-Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5).
+
+ ## Checkpoints
+
+ * `Hyper-SDXL-Nstep-lora.safetensors`: LoRA checkpoint, for SDXL-related models.
+ * `Hyper-SD15-Nstep-lora.safetensors`: LoRA checkpoint, for SD1.5-related models.
+ * `Hyper-SDXL-1step-unet.safetensors`: UNet checkpoint distilled from SDXL-Base.
+
+ ## Text-to-Image Usage
+ ### SDXL-related models
+ #### 2-Steps, 4-Steps, 8-Steps LoRA
+ We take the 2-step LoRA as an example; you can also use the other LoRAs with the corresponding inference-step settings.
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, DDIMScheduler
+ from huggingface_hub import hf_hub_download
+ base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ repo_name = "ByteDance/Hyper-SD"
+ # Take the 2-step LoRA as an example
+ ckpt_name = "Hyper-SDXL-2steps-lora.safetensors"
+ # Load model.
+ pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
+ pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
+ pipe.fuse_lora()
+ # Ensure the DDIM scheduler's timestep spacing is set to "trailing"!
+ pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
+ prompt = "a photo of a cat"
+ image = pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0]
+ ```
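+
+ The CFG-preserved LoRAs from the News section load the same way. Below is a minimal sketch (ours, not from the official docs), assuming the 8-step CFG LoRA also uses the trailing DDIM scheduler; per the News entry it supports guidance scales of 5~8, so negative prompts become usable:
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, DDIMScheduler
+ from huggingface_hub import hf_hub_download
+ pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16").to("cuda")
+ # CFG-preserved 8-step LoRA; checkpoint name taken from the News section above
+ pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-8steps-CFG-lora.safetensors"))
+ pipe.fuse_lora()
+ # Assumption: the same trailing DDIM scheduler as the N-step LoRAs above
+ pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
+ # CFG is preserved, so a guidance scale in the supported 5~8 range can be used
+ image = pipe(prompt="a photo of a cat", negative_prompt="low quality", num_inference_steps=8, guidance_scale=5.0).images[0]
+ ```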
+
+ #### Unified LoRA (supports 1 to 8 inference steps)
+ You can flexibly adjust the number of inference steps and the eta value to achieve the best results.
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, TCDScheduler
+ from huggingface_hub import hf_hub_download
+ base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ repo_name = "ByteDance/Hyper-SD"
+ ckpt_name = "Hyper-SDXL-1step-lora.safetensors"
+ # Load model.
+ pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
+ pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
+ pipe.fuse_lora()
+ # Use the TCD scheduler to achieve better image quality
+ pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
+ # Lower eta results in more detail for multi-step inference
+ eta = 1.0
+ prompt = "a photo of a cat"
+ image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0]
+ ```
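+
+ For multi-step inference with the same unified LoRA, only the step count and eta need to change; an illustrative variant (the values are examples, not official recommendations):
+ ```python
+ # 4-step inference; a lower eta tends to give more detail
+ image = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=0, eta=0.5).images[0]
+ ```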
+
+ #### 1-Step SDXL UNet
+ For single-step inference only.
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
+ from huggingface_hub import hf_hub_download
+ from safetensors.torch import load_file
+ base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ repo_name = "ByteDance/Hyper-SD"
+ ckpt_name = "Hyper-SDXL-1step-Unet.safetensors"
+ # Load model.
+ unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
+ unet.load_state_dict(load_file(hf_hub_download(repo_name, ckpt_name), device="cuda"))
+ pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
+ # Use the LCM scheduler instead of the DDIM scheduler to support specific timestep inputs
+ pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
+ # Set the start timestep to 800 for one-step inference to get better results
+ prompt = "a photo of a cat"
+ image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, timesteps=[800]).images[0]
+ ```
+
+ ### SD1.5-related models
+
+ #### 2-Steps, 4-Steps, 8-Steps LoRA
+ We take the 2-step LoRA as an example; you can also use the other LoRAs with the corresponding inference-step settings.
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, DDIMScheduler
+ from huggingface_hub import hf_hub_download
+ base_model_id = "runwayml/stable-diffusion-v1-5"
+ repo_name = "ByteDance/Hyper-SD"
+ # Take the 2-step LoRA as an example
+ ckpt_name = "Hyper-SD15-2steps-lora.safetensors"
+ # Load model.
+ pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
+ pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
+ pipe.fuse_lora()
+ # Ensure the DDIM scheduler's timestep spacing is set to "trailing"!
+ pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
+ prompt = "a photo of a cat"
+ image = pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0]
+ ```
+
+ #### Unified LoRA (supports 1 to 8 inference steps)
+ You can flexibly adjust the number of inference steps and the eta value to achieve the best results.
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, TCDScheduler
+ from huggingface_hub import hf_hub_download
+ base_model_id = "runwayml/stable-diffusion-v1-5"
+ repo_name = "ByteDance/Hyper-SD"
+ ckpt_name = "Hyper-SD15-1step-lora.safetensors"
+ # Load model.
+ pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
+ pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
+ pipe.fuse_lora()
+ # Use the TCD scheduler to achieve better image quality
+ pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
+ # Lower eta results in more detail for multi-step inference
+ eta = 1.0
+ prompt = "a photo of a cat"
+ image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0]
+ ```
+
+ ## ControlNet Usage
+ ### SDXL-related models
+
+ #### 2-Steps, 4-Steps, 8-Steps LoRA
+ We take the Canny ControlNet with 2-step inference as an example:
+ ```python
+ import torch
+ from diffusers.utils import load_image
+ import numpy as np
+ import cv2
+ from PIL import Image
+ from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, DDIMScheduler
+ from huggingface_hub import hf_hub_download
+
+ # Load original image
+ image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")
+ image = np.array(image)
+ # Prepare the Canny control image
+ low_threshold = 100
+ high_threshold = 200
+ image = cv2.Canny(image, low_threshold, high_threshold)
+ image = image[:, :, None]
+ image = np.concatenate([image, image, image], axis=2)
+ control_image = Image.fromarray(image)
+ control_image.save("control.png")
+ control_weight = 0.5  # recommended for good generalization
+
+ # Initialize pipeline
+ controlnet = ControlNetModel.from_pretrained(
+     "diffusers/controlnet-canny-sdxl-1.0",
+     torch_dtype=torch.float16
+ )
+ vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
+ pipe = StableDiffusionXLControlNetPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda")
+
+ pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-2steps-lora.safetensors"))
+ # Ensure the DDIM scheduler's timestep spacing is set to "trailing"!
+ pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
+ pipe.fuse_lora()
+ image = pipe("A chocolate cookie", num_inference_steps=2, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight).images[0]
+ image.save('image_out.png')
+ ```
+
+ #### Unified LoRA (supports 1 to 8 inference steps)
+ We take the Canny ControlNet as an example:
+ ```python
+ import torch
+ from diffusers.utils import load_image
+ import numpy as np
+ import cv2
+ from PIL import Image
+ from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, TCDScheduler
+ from huggingface_hub import hf_hub_download
+
+ # Load original image
+ image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")
+ image = np.array(image)
+ # Prepare the Canny control image
+ low_threshold = 100
+ high_threshold = 200
+ image = cv2.Canny(image, low_threshold, high_threshold)
+ image = image[:, :, None]
+ image = np.concatenate([image, image, image], axis=2)
+ control_image = Image.fromarray(image)
+ control_image.save("control.png")
+ control_weight = 0.5  # recommended for good generalization
+
+ # Initialize pipeline
+ controlnet = ControlNetModel.from_pretrained(
+     "diffusers/controlnet-canny-sdxl-1.0",
+     torch_dtype=torch.float16
+ )
+ vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
+ pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-base-1.0",
+     controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda")
+
+ # Load the Hyper-SDXL 1-step unified LoRA
+ pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-1step-lora.safetensors"))
+ pipe.fuse_lora()
+ # Use the TCD scheduler to achieve better image quality
+ pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
+ # Lower eta results in more detail for multi-step inference
+ eta = 1.0
+ image = pipe("A chocolate cookie", num_inference_steps=4, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight, eta=eta).images[0]
+ image.save('image_out.png')
+ ```
+
+ ### SD1.5-related models
+
+ #### 2-Steps, 4-Steps, 8-Steps LoRA
+ We take the Canny ControlNet with 2-step inference as an example:
+ ```python
+ import torch
+ from diffusers.utils import load_image
+ import numpy as np
+ import cv2
+ from PIL import Image
+ from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, DDIMScheduler
+ from huggingface_hub import hf_hub_download
+
+ controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny"
+
+ # Load original image
+ image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png")
+ image = np.array(image)
+ # Prepare the Canny control image
+ low_threshold = 100
+ high_threshold = 200
+ image = cv2.Canny(image, low_threshold, high_threshold)
+ image = image[:, :, None]
+ image = np.concatenate([image, image, image], axis=2)
+ control_image = Image.fromarray(image)
+ control_image.save("control.png")
+
+ # Initialize pipeline
+ controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16)
+ pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda")
+ pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-2steps-lora.safetensors"))
+ pipe.fuse_lora()
+ # Ensure the DDIM scheduler's timestep spacing is set to "trailing"!
+ pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
+ image = pipe("a blue paradise bird in the jungle", num_inference_steps=2, image=control_image, guidance_scale=0).images[0]
+ image.save('image_out.png')
+ ```
+
+ #### Unified LoRA (supports 1 to 8 inference steps)
+ We take the Canny ControlNet as an example:
+ ```python
+ import torch
+ from diffusers.utils import load_image
+ import numpy as np
+ import cv2
+ from PIL import Image
+ from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, TCDScheduler
+ from huggingface_hub import hf_hub_download
+
+ controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny"
+
+ # Load original image
+ image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png")
+ image = np.array(image)
+ # Prepare the Canny control image
+ low_threshold = 100
+ high_threshold = 200
+ image = cv2.Canny(image, low_threshold, high_threshold)
+ image = image[:, :, None]
+ image = np.concatenate([image, image, image], axis=2)
+ control_image = Image.fromarray(image)
+ control_image.save("control.png")
+
+ # Initialize pipeline
+ controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16)
+ pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda")
+ # Load the Hyper-SD15 1-step unified LoRA
+ pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-1step-lora.safetensors"))
+ pipe.fuse_lora()
+ # Use the TCD scheduler to achieve better image quality
+ pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
+ # Lower eta results in more detail for multi-step inference
+ eta = 1.0
+ image = pipe("a blue paradise bird in the jungle", num_inference_steps=1, image=control_image, guidance_scale=0, eta=eta).images[0]
+ image.save('image_out.png')
+ ```
+ ## ComfyUI Usage
+ * `Hyper-SDXL-Nsteps-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-Nsteps-lora-workflow.json)
+ * `Hyper-SD15-Nsteps-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SD15-Nsteps-lora-workflow.json)
+ * `Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-Unet-workflow.json)
+   * **REQUIREMENT / INSTALL** for the 1-Step SDXL UNet: please install our [scheduler folder](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui/ComfyUI-HyperSDXL1StepUnetScheduler) into your `ComfyUI/custom_nodes` to enable sampling from timestep 800 instead of 999.
+   * i.e., make sure the `ComfyUI/custom_nodes/ComfyUI-HyperSDXL1StepUnetScheduler` folder exists.
+   * For more details, please refer to our [technical report](https://arxiv.org/abs/2404.13686).
+ * `Hyper-SD15-1step-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SD15-1step-unified-lora-workflow.json)
+ * `Hyper-SDXL-1step-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-unified-lora-workflow.json)
+   * **REQUIREMENT / INSTALL** for the 1-Step Unified LoRAs: please install [ComfyUI-TCD](https://github.com/JettHu/ComfyUI-TCD) into your `ComfyUI/custom_nodes` to enable the TCDScheduler, which supports different inference steps (1~8) with a single checkpoint.
+   * i.e., make sure the `ComfyUI/custom_nodes/ComfyUI-TCD` folder exists.
+   * You're encouraged to adjust the eta parameter in TCDScheduler to get better results.
+
+ ## Citation
+ ```bibtex
+ @misc{ren2024hypersd,
+   title={Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis},
+   author={Yuxi Ren and Xin Xia and Yanzuo Lu and Jiacheng Zhang and Jie Wu and Pan Xie and Xing Wang and Xuefeng Xiao},
+   year={2024},
+   eprint={2404.13686},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV}
+ }
+ ```
Hyper-SDXL-1step-lora.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5a75acf70ca40a9c8ab0a2dd1bf76174c64f9636e98809fe87a223616e3cc4d9
+ size 393854592
Hyper-SDXL-2steps-lora.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0ef7e0cc5aa10d89eb7cbb77ec15b83e6ae1a596f75a31bccb227cc6043702e1
+ size 393854592
Hyper-SDXL-4steps-lora.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06c37b7cd5f5c5c2aa21deeb69c7b49b6124c4402b8dbd9e78fbf36ca3243b82
+ size 393854592
PCM-README.md ADDED
@@ -0,0 +1,9 @@
+ ---
+ library_name: diffusers
+ pipeline_tag: text-to-image
+ ---
+ # Phased Consistency Model
+
+ LoRA weights of Stable Diffusion v1-5 for fast text-to-image generation.
+
+ [[paper](https://huggingface.co/papers/2405.18407)] [[arXiv](https://arxiv.org/abs/2405.18407)] [[code](https://github.com/G-U-N/Phased-Consistency-Model)] [[project page](https://g-u-n.github.io/projects/pcm)]
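+
+ A minimal, unofficial sketch of loading one of the PCM LoRA files uploaded in this commit into an SDXL pipeline. The scheduler choice and guidance value are assumptions inferred from the "smallcfg" / "4step" filename; check the official [code](https://github.com/G-U-N/Phased-Consistency-Model) for the intended settings:
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, LCMScheduler
+
+ pipe = DiffusionPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
+ ).to("cuda")
+ # File uploaded in this commit; "smallcfg" suggests a low guidance scale
+ pipe.load_lora_weights(".", weight_name="pcm_sdxl_smallcfg_4step.safetensors")
+ pipe.fuse_lora()
+ # Assumption: an LCM-style scheduler; the official repo may provide its own
+ pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
+
+ image = pipe("a photo of a cat", num_inference_steps=4, guidance_scale=1.0).images[0]
+ ```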
SLAM-README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ library_name: diffusers
+ base_model: stabilityai/stable-diffusion-xl-base-1.0
+ tags:
+ - text-to-image
+ license: apache-2.0
+ inference: false
+ ---
+ # Sub-path Linear Approximation Model (SLAM) LoRA: SDXL
+ Paper: [https://arxiv.org/abs/2404.13903](https://arxiv.org/abs/2404.13903)<br/>
+ Project Page: [https://subpath-linear-approx-model.github.io/](https://subpath-linear-approx-model.github.io/)<br/>
+ This checkpoint is distilled from [stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) with our proposed Sub-path Linear Approximation Model, which reduces the number of inference steps to only 2-4.
+ ## Usage
+ First, install the latest versions of the Diffusers library as well as `peft`, `accelerate`, and `transformers`.
+ ```bash
+ pip install --upgrade pip
+ pip install --upgrade diffusers transformers accelerate peft
+ ```
+ We implement SLAM to be compatible with the [LCMScheduler](https://huggingface.co/docs/diffusers/v0.22.3/en/api/schedulers/lcm#diffusers.LCMScheduler). You can use SLAM-LoRA just like you use LCM-LoRA.
+ ```python
+ import torch
+ from diffusers import LCMScheduler, AutoPipelineForText2Image
+
+ model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ adapter_id = "alimama-creative/slam-lora-sdxl"
+
+ pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16")
+ pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
+ pipe.to("cuda")
+
+ # Load and fuse the SLAM LoRA
+ pipe.load_lora_weights(adapter_id)
+ pipe.fuse_lora()
+
+ prompt = "A brown teddy bear holding a glass vase in front of a grave."
+
+ image = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=1.0).images[0]
+ ```
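+
+ Since SLAM targets 2-4 steps, faster generation only requires changing the step count; an illustrative variant (not from the original card):
+ ```python
+ # 2-step inference with the same fused pipeline
+ image = pipe(prompt=prompt, num_inference_steps=2, guidance_scale=1.0).images[0]
+ ```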
+
+ Comparison with latent-consistency/lcm-lora-sdxl:
+ <img src='https://huggingface.co/alimama-creative/slam-lora-sdxl/resolve/main/sdxl_cmp.jpg'>
+
+ ---
+
+ More examples:
+ <img src='https://huggingface.co/alimama-creative/slam-lora-sdxl/resolve/main/slam-lora-sdxl.jpg'>
TCD-SDXL.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c777bc60abf41d3eb0fe405d23d73c280a020eea5adf97a82a141592c33feba
+ size 393854624
flash-sdxl.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:afe2ca6e27c4c6087f50ef42772c45d7b0efbc471b76e422492403f9cae724d7
+ size 371758976
pcm_sdxl_smallcfg_2step.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:242cbe4695fe3f2e248faa71cf53f2ccbf248a316973e4b2f38ab9e34f35a5ab
+ size 393854624
pcm_sdxl_smallcfg_4step.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d0bf40a7f280829195563486bec7253f043a06b1f218602b20901c367641023e
+ size 393854624
sd_xl_turbo_lora_v1.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a599c42a9f4f7494c7f410dbc0fd432cf0242720509e9d52fa41aac7a88d1b69
+ size 787342192
sdxl-turbo-dpo-lora.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c31b841d4574f724fa9f2600b69fd30c81d2593d2e11a3b6bdf51ec8b8761e6d
+ size 46615272
sdxl_lightning_2step_lora.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:04fafc778385b24144498a8247498ae4fb7a69f702ea0566bdce2845a31fcc43
+ size 393854592
sdxl_lightning_4step_lora.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bf56cf2657efb15e465d81402ed481d1e11c4677e4bcce1bc11fe71ad8506b79
+ size 393854592
slam-lora-sdxl.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:22569a946b0db645aa3b8eb782c674c8e726a7cc0d655887c21fecf6dfe6ad91
+ size 393854592