OzzyGT (HF Staff) committed
Commit c10b7b4 · verified · 1 Parent(s): 4043cb4

Update README.md

Files changed (1)
  1. README.md +118 -1
README.md CHANGED
@@ -1,7 +1,124 @@
  ---
  license: apache-2.0
  ---

  This is just the transformer model with the fused 8-step [Lightning LoRA](https://huggingface.co/lightx2v/Qwen-Image-Lightning)

- Original model: [Qwen-Image](https://huggingface.co/Qwen/Qwen-Image)
  ---
  license: apache-2.0
+ library_name: diffusers
+ pipeline_tag: text-to-image
  ---

  This is just the transformer model with the fused 8-step [Lightning LoRA](https://huggingface.co/lightx2v/Qwen-Image-Lightning)

+ Original model: [Qwen-Image](https://huggingface.co/Qwen/Qwen-Image)
+
+ I'm using this repository for testing purposes, so keep in mind that this is not the official way to use the model.
+
+ # How to test (24GB GPU)
+
+ Install diffusers from main:
+
+ ```sh
+ pip install git+https://github.com/huggingface/diffusers
+ ```
+
+ ```python
+ import torch
+
+ from diffusers import DiffusionPipeline, GGUFQuantizationConfig, QwenImageTransformer2DModel
+
+
+ torch_dtype = torch.bfloat16
+ model_id = "Qwen/Qwen-Image"
+
+ # Load the GGUF-quantized transformer; its config comes from the original repo.
+ transformer = QwenImageTransformer2DModel.from_single_file(
+     "https://huggingface.co/OzzyGT/qwen-image-lighting-gguf/blob/main/qwen-image-lighting-Q4_K_S.gguf",
+     quantization_config=GGUFQuantizationConfig(compute_dtype=torch_dtype),
+     torch_dtype=torch_dtype,
+     config="Qwen/Qwen-Image",
+     subfolder="transformer",
+ )
+ pipe = DiffusionPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=torch_dtype)
+ # Keep each component on the CPU until it is needed, to reduce VRAM use.
+ pipe.enable_model_cpu_offload()
+ prompt = "stock photo of two people, a man and a woman, wearing lab coats writing on a white board with markers, the white board has text that reads 'The Diffusers library by Hugging Face makes it easy for developers to run image generation and inference using state-of-the-art diffusion models with just a few lines of code' with sloppy writing and traces clearly made by a human. The photo is taken from the side and has depth of field so some parts of the board looks blurred giving it a more professional look"
+
+ # Fixed seed so the run is repeatable.
+ generator = torch.Generator(device="cuda").manual_seed(42)
+
+ image = pipe(
+     prompt=prompt,
+     negative_prompt="",
+     width=1664,
+     height=928,
+     num_inference_steps=8,  # matches the fused 8-step LoRA
+     true_cfg_scale=1.0,
+     generator=generator,
+ ).images[0]
+
+ image.save("gguf_lighting_qwen.png")
+ ```
+
+ ## Result
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63df091910678851bb0cd0e0/zfv69rbwD0dMJoa0QgeJj.png)
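The `from_single_file` call above is given a Hub `blob` URL. If you would rather work with a repo id and filename (for example, to download the GGUF file ahead of time), the URL can be split as in the rough sketch below. The helper name `split_hub_blob_url` is hypothetical, and the sketch assumes the URL layout `https://huggingface.co/<owner>/<repo>/blob/<revision>/<path>` with a revision that contains no slashes.

```python
from urllib.parse import urlparse


def split_hub_blob_url(url: str) -> tuple[str, str, str]:
    """Split a Hub file URL into (repo_id, revision, filename).

    Hypothetical helper; assumes <owner>/<repo>/blob/<revision>/<path>
    and a revision without slashes (e.g. "main").
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"not a Hub file URL: {url}")
    owner, repo, _, revision = parts[:4]
    # Anything after the revision is the path of the file inside the repo.
    filename = "/".join(parts[4:])
    return f"{owner}/{repo}", revision, filename


url = "https://huggingface.co/OzzyGT/qwen-image-lighting-gguf/blob/main/qwen-image-lighting-Q4_K_S.gguf"
repo_id, revision, filename = split_hub_blob_url(url)
print(repo_id, revision, filename)
```

The three values map onto the `repo_id`, `revision`, and `filename` arguments of `huggingface_hub.hf_hub_download`. Whether the local path it returns works with `from_single_file` in your diffusers version is something to verify, not something this repository documents.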
+
+ # How to test (16GB GPU)
+
+ Install diffusers from main:
+
+ ```sh
+ pip install git+https://github.com/huggingface/diffusers
+ ```
+
+ ```python
+ import torch
+ from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig
+ from transformers import Qwen2_5_VLForConditionalGeneration
+
+ from diffusers import DiffusionPipeline, GGUFQuantizationConfig, QwenImageTransformer2DModel
+
+
+ torch_dtype = torch.bfloat16
+ model_id = "Qwen/Qwen-Image"
+
+ # Load the GGUF-quantized transformer; its config comes from the original repo.
+ transformer = QwenImageTransformer2DModel.from_single_file(
+     "https://huggingface.co/OzzyGT/qwen-image-lighting-gguf/blob/main/qwen-image-lighting-Q4_K_S.gguf",
+     quantization_config=GGUFQuantizationConfig(compute_dtype=torch_dtype),
+     torch_dtype=torch_dtype,
+     config="Qwen/Qwen-Image",
+     subfolder="transformer",
+ )
+
+ # Quantize the text encoder to 4-bit NF4 with bitsandbytes to reduce memory use.
+ quantization_config = TransformersBitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+
+ text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
+     model_id,
+     subfolder="text_encoder",
+     quantization_config=quantization_config,
+     torch_dtype=torch_dtype,
+ )
+ text_encoder = text_encoder.to("cpu")
+
+ pipe = DiffusionPipeline.from_pretrained(
+     model_id, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch_dtype
+ )
+ # Keep each component on the CPU until it is needed, to reduce VRAM use.
+ pipe.enable_model_cpu_offload()
+ prompt = "stock photo of two people, a man and a woman, wearing lab coats writing on a white board with markers, the white board has text that reads 'The Diffusers library by Hugging Face makes it easy for developers to run image generation and inference using state-of-the-art diffusion models with just a few lines of code' with sloppy writing and traces clearly made by a human. The photo is taken from the side and has depth of field so some parts of the board looks blurred giving it a more professional look"
+
+ # Fixed seed so the run is repeatable.
+ generator = torch.Generator(device="cuda").manual_seed(42)
+
+ image = pipe(
+     prompt=prompt,
+     negative_prompt="",
+     width=1664,
+     height=928,
+     num_inference_steps=8,  # matches the fused 8-step LoRA
+     true_cfg_scale=1.0,
+     generator=generator,
+ ).images[0]
+
+ image.save("gguf_lighting_qwen.png")
+ ```
+
+ ## Result
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63df091910678851bb0cd0e0/I6V_CwrxkvX88NpDmjOFy.png)
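The two recipes above differ only in how the text encoder is loaded, so the choice comes down to how much VRAM the GPU has. A minimal sketch of that decision follows. The function names are hypothetical, and rounding to whole GiB is an assumption made because cards usually report slightly less than their nominal size. Anything below 16GB is labeled untested, since the README does not cover it.

```python
from typing import Optional


def pick_recipe(total_vram_bytes: int) -> str:
    """Map a GPU's total memory to one of the two recipes in this README."""
    # Assumption: round to whole GiB so a "24GB" card reporting ~23.99 GiB still counts as 24.
    nominal_gib = round(total_vram_bytes / 2**30)
    if nominal_gib >= 24:
        return "24GB recipe: GGUF transformer, bf16 text encoder"
    if nominal_gib >= 16:
        return "16GB recipe: GGUF transformer, 4-bit NF4 text encoder"
    return "untested: below the 16GB recipe"


def detect_total_vram_bytes() -> Optional[int]:
    """Return the total memory of GPU 0, or None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory
```

For example, `pick_recipe(detect_total_vram_bytes())` can be called once `detect_total_vram_bytes()` is known to return a value other than `None`.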