AlekseyCalvin committed
Commit d81fe2f · verified · 1 Parent(s): 31bd9cc

Update README.md

Files changed (1): README.md +121 -194
---
library_name: diffusers
base_model: Qwen/Qwen-Image
base_model_relation: quantized
quantized_by: AlekseyCalvin
license: apache-2.0
language:
- en
- zh
pipeline_tag: text-to-image
tags:
- fp4
- Abliterated
- quantized
- 4-bit
- Qwen2.5-VL7b-Abliterated
- instruct
- Diffusers
- Transformers
- uncensored
- text-to-image
- image-to-image
- image-generation
---
<p align="center">
<img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/qwen_image_logo.png" width="200"/>
</p>

# QWEN-IMAGE Model |fp4| + Abliterated Qwen2.5-VL-7B
This repo contains a variant of Qwen's **[QWEN-IMAGE](https://huggingface.co/Qwen/Qwen-Image)**, a state-of-the-art generative model with extensive text-to-image, image-to-image, and instruction/control-based editing capabilities. <br>

To make these cutting-edge capabilities more accessible to those constrained to low-end consumer-grade hardware, **we've quantized the DiT (Diffusion Transformer) component of Qwen-Image to the 4-bit FP4 format** using the bitsandbytes library.<br>
We derived this quantization directly from the BF16 base model weights released on 08/04/2025, with no other mix-ins or modifications to the DiT component. <br>
*NOTE: Install `bitsandbytes` prior to inference.* <br>
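
FP4 stores each weight as a 4-bit floating-point code (1 sign, 2 exponent, and 1 mantissa bit) plus a shared per-block scale factor. The sketch below is our own simplified illustration of such round-to-nearest blockwise quantization; bitsandbytes uses its own codebook, bit-packing, and CUDA kernels.

```python
# Toy FP4 (E2M1-style) blockwise quantization sketch. Illustration only:
# bitsandbytes' real implementation differs in codebook, packing, and kernels.
FP4_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # positive E2M1 grid

def quantize_block(block):
    """Return a per-block absmax scale and, per value, the nearest signed FP4 grid point."""
    absmax = max(abs(x) for x in block) or 1.0  # guard against an all-zero block
    codes = []
    for x in block:
        scaled = x / absmax * 6.0  # map into the FP4 range [-6, 6]
        nearest = min(FP4_VALUES, key=lambda v: abs(abs(scaled) - v))
        codes.append(nearest if scaled >= 0 else -nearest)
    return absmax, codes

def dequantize_block(absmax, codes):
    """Invert the mapping: grid point -> original scale."""
    return [c / 6.0 * absmax for c in codes]

absmax, codes = quantize_block([0.1, -0.4, 0.75, 1.5])
print(codes)                            # [0.5, -1.5, 3.0, 6.0]
print(dequantize_block(absmax, codes))  # ~[0.125, -0.375, 0.75, 1.5]
```

With 16 grid points per weight plus one scale per block, this roughly quarters the DiT's memory footprint relative to BF16, at the cost of the rounding error visible in the first reconstructed value above.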

**QWEN-IMAGE** is an open-weights, customization-friendly frontier model released under the highly permissive Apache 2.0 license, welcoming unrestricted (within legal limits) commercial, experimental, artistic, academic, and other uses and/or modifications. <br>

To help highlight the horizons of possibility broadened by the **QWEN-IMAGE** release, our quantization is bundled with an "Abliterated" (i.e. de-censored) finetune of [Qwen2.5-VL 7B Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct), which serves as QWEN-IMAGE's sole conditioning encoder (of prompts, instructions, input images, controls, etc.) and is a powerful Vision-Language Model in its own right. <br>

As such, our repo pairs a lean & prim FP4 DiT with the **[Qwen2.5-VL-7B-Abliterated-Caption-it](https://huggingface.co/prithivMLmods/Qwen2.5-VL-7B-Abliterated-Caption-it/tree/main)** by [Prithiv Sakthi](https://huggingface.co/prithivMLmods) (aka [prithivMLmods](https://github.com/prithivsakthiur)).
<p align="center">
<img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/merge3.jpg" width="1600"/>
</p>

# NOTICE:
*Do not be alarmed by the file warning from the ClamAV automated checker.* <br>
*It is a clear false positive. When assessing one of the typical Diffusers-adapted Safetensors shards (model weights), the checker reports:*
``The following viruses have been found: Pickle.Malware.SysAccess.sys.STACK_GLOBAL.UNOFFICIAL`` <br>
*However, a Safetensors file by design cannot contain such inserts. You may confirm this for yourself through HF's built-in weight/index viewer. <br>
So, to be sure: this repo does **not** contain any pickle checkpoints, or any other pickled data.* <br>
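
That safety claim follows from the file format itself: a safetensors file is just an 8-byte little-endian header length, a JSON header (dtype, shape, and byte offsets per tensor), and raw tensor bytes, leaving no place for executable pickle opcodes such as the `STACK_GLOBAL` named in the signature. Below is a minimal stdlib-only sketch of that layout (our illustration; the real `safetensors` library adds validation this sketch omits):

```python
import io
import json
import struct

def write_safetensors(buf, tensors):
    """Write a minimal safetensors-style file: 8-byte header length,
    JSON header, then the concatenated raw tensor bytes."""
    header, offset, blobs = {}, 0, []
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        offset += len(data)
        blobs.append(data)
    header_json = json.dumps(header).encode("utf-8")
    buf.write(struct.pack("<Q", len(header_json)))  # little-endian u64 length
    buf.write(header_json)
    for blob in blobs:
        buf.write(blob)

def read_header(buf):
    """Parse only the JSON header -- everything after it is inert tensor data."""
    buf.seek(0)
    (header_len,) = struct.unpack("<Q", buf.read(8))
    return json.loads(buf.read(header_len))

buf = io.BytesIO()
write_safetensors(buf, {"w": ("F32", [2], struct.pack("<2f", 1.0, 2.0))})
print(read_header(buf))  # {'w': {'dtype': 'F32', 'shape': [2], 'data_offsets': [0, 8]}}
```

Contrast this with pickle, whose opcodes can invoke arbitrary callables at load time; the flagged signature targets those opcodes, which the safetensors layout has no way to encode.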

# TEXT-TO-IMAGE PIPELINE EXAMPLE:
This repo is formatted for use with the Diffusers (0.35.0.dev0+) and Transformers libraries, via the associated pipelines and model component classes, such as the defaults listed in `model_index.json` (in this repo's root folder). <br>
*Sourced/adapted from [the original base model repo](https://huggingface.co/Qwen/Qwen-Image) by Qwen.* <br>
**EDIT: We've encountered some issues with the pipeline below. We will update this section once a reliable replacement is confirmed.** <br>
```python
from diffusers import DiffusionPipeline
import torch
import bitsandbytes  # required for loading the fp4-quantized weights

model_name = "AlekseyCalvin/QwenImage_fp4_diffusers"

# Load the pipeline
if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
    device = "cuda"
else:
    torch_dtype = torch.float32
    device = "cpu"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
pipe = pipe.to(device)

# Quality-boosting suffixes, keyed by prompt language
positive_magic = {
    "en": "Ultra HD, 4K, cinematic composition.",  # for English prompts
    "zh": "超清,4K,电影级构图",  # for Chinese prompts
}

# Generate image
prompt = '''A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". Ultra HD, 4K, cinematic composition'''
negative_prompt = " "

# Preset resolutions for different aspect ratios
aspect_ratios = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1140),
    "3:4": (1140, 1472),
}
width, height = aspect_ratios["16:9"]

image = pipe(
    prompt=prompt + positive_magic["en"],
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_inference_steps=50,
    true_cfg_scale=4.0,
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]
image.save("example.png")
```
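
The preset table in the example above is hand-picked. For other aspect ratios, dimensions can be derived from a target pixel budget; the sketch below is our own heuristic (assuming a 1328×1328 pixel budget and snapping to multiples of 16, a common latent-alignment choice), not an official Qwen utility:

```python
import math

def dims_for_ratio(w_ratio, h_ratio, budget=1328 * 1328, multiple=16):
    """Pick (width, height) close to w_ratio:h_ratio with about `budget`
    total pixels, snapped to `multiple`-pixel alignment."""
    scale = math.sqrt(budget / (w_ratio * h_ratio))
    def snap(x):
        return max(multiple, round(x / multiple) * multiple)
    return snap(w_ratio * scale), snap(h_ratio * scale)

print(dims_for_ratio(1, 1))   # (1328, 1328)
print(dims_for_ratio(21, 9))  # an ultrawide shape absent from the preset table
```

Note that the official 4:3 preset (1472×1140) is not 16-aligned, so the published presets evidently involve additional hand-tuning beyond any single formula like this one.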
<br>

# SHOWCASES FROM THE QWEN TEAM:
![](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s1.jpg#center)
![](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s3.jpg#center)
![](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s2.jpg#center)

# MORE INFO:
- Check out the [Technical Report](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf) for QWEN-IMAGE, released by the Qwen team! <br>
- Find the source base model weights at [Hugging Face](https://huggingface.co/Qwen/Qwen-Image) and at [ModelScope](https://modelscope.cn/models/Qwen/Qwen-Image).

## QWEN LINKS:
<p align="center">
💜 <a href="https://chat.qwen.ai/"><b>Qwen Chat</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;🤗 <a href="https://huggingface.co/Qwen/Qwen-Image">Hugging Face</a>&nbsp;&nbsp;|&nbsp;&nbsp;🤖 <a href="https://modelscope.cn/models/Qwen/Qwen-Image">ModelScope</a>&nbsp;&nbsp;|&nbsp;&nbsp;📑 <a href="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf">Tech Report</a>&nbsp;&nbsp;|&nbsp;&nbsp;📑 <a href="https://qwenlm.github.io/blog/qwen-image/">Blog</a>
<br>
🖥️ <a href="https://huggingface.co/spaces/Qwen/qwen-image">Demo</a>&nbsp;&nbsp;|&nbsp;&nbsp;💬 <a href="https://github.com/QwenLM/Qwen-Image/blob/main/assets/wechat.png">WeChat (微信)</a>&nbsp;&nbsp;|&nbsp;&nbsp;🫨 <a href="https://discord.gg/CV4E9rpNSD">Discord</a>
</p>

## QWEN-IMAGE TECHNICAL REPORT CITATION:
```bibtex
@article{qwen-image,
  title={Qwen-Image Technical Report},
  author={Qwen Team},
  journal={arXiv preprint},
  year={2025}
}
```