AlekseyCalvin committed on
Commit
6881a32
·
verified ·
1 Parent(s): cc7bb50

Update README.md

Files changed (1)
  1. README.md +92 -8
README.md CHANGED
@@ -7,23 +7,107 @@ license: apache-2.0
  language:
  - en
  - zh
- - nf4
  pipeline_tag: text-to-image
  ---

- # QWEN-IMAGE MODEL
- This repo is a quantization of the [Qwen-Image model by Qwen](https://huggingface.co/Qwen/Qwen-Image). <br>
- **DiT (Diffusion Transformer) quantized to NF4 using BitsAndBytes**
- All other components in BF16.

  # NOTICE:
  *Do not be alarmed by the file warning from the ClamAV automated checker.* <br>
- *It is a clear false positive.*
- *In assessing one of the typical Diffusers-adapted Safetensors shards (model weights), the checker reads:*
  ``The following viruses have been found: Pickle.Malware.SysAccess.sys.STACK_GLOBAL.UNOFFICIAL`` <br>
  *However, a Safetensors can not contain suchlike inserts. <br>
- You may confirm for yourself through HF's built-in utility weight tensor index scanner/viewer. <br>
  To be sure, this repo does **not** contain any pickle checkpoints, or any other pickled data.* <br>

  language:
  - en
  - zh
  pipeline_tag: text-to-image
+ tags:
+ - nf4
+ - Abliterated
+ - Qwen2.5-VL7b-Abliterated
+ - instruct
+ - Diffusers
+ - Transformers
+ - uncensored
+ - text-to-image
+ - image-to-image
+ - image-generation
  ---
+ <p align="center">
+ <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/qwen_image_logo.png" width="200"/>
+ </p>
+ # QWEN-IMAGE MODEL (NF4) w/Abliterated Qwen2.5-VL-7B
+ This repo contains a variant of QWEN's **[QWEN-IMAGE](https://huggingface.co/Qwen/Qwen-Image)**, a state-of-the-art generative model with extensive text-to-image, image-to-image, and instruction/control-based editing capabilities. <br>
+ To make these cutting-edge capabilities more accessible to those constrained to low-end consumer-grade hardware, **we've quantized the DiT (Diffusion Transformer) component of Qwen-Image to the 4-bit NF4 format** using the bitsandbytes library.<br>
+ We derived this quantization directly from the BF16 base model weights released on August 4, 2025, with no other mix-ins or modifications to the DiT component. <br>
+ QWEN-IMAGE is an open-weights, customization-friendly frontier model released under the permissive Apache 2.0 license, which allows commercial use and modification. <br>
+ To further highlight the possibilities opened up by the release of QWEN-IMAGE, our quantization is bundled with an "Abliterated" (aka de-censored) finetune of [Qwen2.5-VL 7B Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct). Qwen2.5-VL is QWEN-IMAGE's sole conditioning encoder (for prompts, instructions, input images, controls, etc.), as well as a powerful vision-language model in its own right. <br>
+ As such, our repo pairs a lean NF4 DiT with [Qwen2.5-VL-7B-Abliterated-Caption-it](https://huggingface.co/prithivMLmods/Qwen2.5-VL-7B-Abliterated-Caption-it/tree/main) by [Prithiv Sakthi](https://huggingface.co/prithivMLmods) (aka [prithivMLmods](https://github.com/prithivsakthiur)).
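
For a rough sense of why the NF4 quantization described above matters on low-end hardware, the sketch below estimates weight storage in BF16 versus NF4. The 20-billion parameter count, the 64-weight block size, and the one-FP32-scale-per-block overhead are assumptions chosen for illustration, not figures taken from this repo, and `weight_footprint_gib` is a hypothetical helper.

```python
def weight_footprint_gib(num_params: int, bits_per_weight: float,
                         block_size: int = 0, scale_bytes: int = 0) -> float:
    """Estimate weight storage in GiB, optionally adding per-block scale overhead."""
    payload = num_params * bits_per_weight / 8
    # Blockwise quantization stores one scale value per block of weights.
    overhead = (num_params / block_size) * scale_bytes if block_size else 0.0
    return (payload + overhead) / 2**30


NUM_PARAMS = 20_000_000_000  # assumed parameter count, for illustration only

bf16_gib = weight_footprint_gib(NUM_PARAMS, 16)
# Assumed NF4 layout: 4 bits per weight plus one 4-byte scale per 64 weights.
nf4_gib = weight_footprint_gib(NUM_PARAMS, 4, block_size=64, scale_bytes=4)

print(f"BF16: {bf16_gib:.1f} GiB, NF4: {nf4_gib:.1f} GiB")
```

Under these assumptions the quantized weights take a bit more than a quarter of the BF16 size; actual memory use also depends on activations and on the other components, which stay unquantized.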

+ This repo is formatted for use with the Diffusers (0.35.0.dev0+) and Transformers libraries, through their associated pipelines and model component classes, such as the defaults listed in `model_index.json` (in this repo's root folder). <br>

  # NOTICE:
  *Do not be alarmed by the file warning from the ClamAV automated checker.* <br>
+ *It is a clear false positive.* *In assessing one of the typical Diffusers-adapted Safetensors shards (model weights), the checker reads:*
  ``The following viruses have been found: Pickle.Malware.SysAccess.sys.STACK_GLOBAL.UNOFFICIAL`` <br>
  *However, a Safetensors can not contain suchlike inserts. <br>
+ You may confirm for yourself through HF's built-in utility weight/tensor index viewer. <br>
  To be sure, this repo does **not** contain any pickle checkpoints, or any other pickled data.* <br>
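
The Safetensors layout helps explain the notice above: a file starts with an 8-byte little-endian header length, followed by a JSON header describing each tensor, followed by raw tensor bytes. There are no pickle opcodes to execute. Here is a minimal sketch of reading that header with only the standard library. The tensor name `w` and the in-memory blob are made up for illustration, and `read_safetensors_header` is a hypothetical helper, not part of any library.

```python
import json
import struct


def read_safetensors_header(raw: bytes) -> dict:
    """Parse the JSON header at the start of a Safetensors byte string."""
    (header_len,) = struct.unpack("<Q", raw[:8])  # unsigned 64-bit, little-endian
    return json.loads(raw[8:8 + header_len].decode("utf-8"))


# Build a tiny in-memory example: one 2x2 float32 tensor of zeros (16 bytes).
header = {"w": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(16)

parsed = read_safetensors_header(blob)
for name, meta in parsed.items():
    print(name, meta["dtype"], meta["shape"])
```

To inspect a real shard, the same function can be given the first bytes of the file; it only reads metadata and never deserializes Python objects.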

+ # TEXT-TO-IMAGE PIPELINE EXAMPLE:
+ *Sourced/adapted from [the original base model repo](https://huggingface.co/Qwen/Qwen-Image) by QWEN.*
+ ```python
+ from diffusers import DiffusionPipeline
+ import torch
+ model_name = "AlekseyCalvin/QwenImage_nf4"
+ # Load the pipeline (bfloat16 on GPU, float32 on CPU)
+ if torch.cuda.is_available():
+     torch_dtype = torch.bfloat16
+     device = "cuda"
+ else:
+     torch_dtype = torch.float32
+     device = "cpu"
+ pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
+ pipe = pipe.to(device)
+ positive_magic = {
+     "en": "Ultra HD, 4K, cinematic composition.",  # suffix for English prompts
+     "zh": "超清,4K,电影级构图",  # suffix for Chinese prompts
+ }
+ # Generate image (the quality suffix from positive_magic is appended in the call below)
+ prompt = '''A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197".'''
+ negative_prompt = " "
+ # Generate with different aspect ratios
+ aspect_ratios = {
+     "1:1": (1328, 1328),
+     "16:9": (1664, 928),
+     "9:16": (928, 1664),
+     "4:3": (1472, 1140),
+     "3:4": (1140, 1472),
+ }
+ width, height = aspect_ratios["16:9"]
+ image = pipe(
+     prompt=prompt + " " + positive_magic["en"],
+     negative_prompt=negative_prompt,
+     width=width,
+     height=height,
+     num_inference_steps=50,
+     true_cfg_scale=4.0,
+     generator=torch.Generator(device=device).manual_seed(42),
+ ).images[0]
+ image.save("example.png")
+ ```
+ <br>
+
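
The example above picks a resolution by hard-coding the `"16:9"` key. If a target size is known instead, a small helper can choose the closest entry from the same table. This is only a sketch: `closest_aspect_ratio` is a hypothetical function and not part of Diffusers, and the table simply repeats the values from the example.

```python
ASPECT_RATIOS = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1140),
    "3:4": (1140, 1472),
}


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the table key whose width/height ratio is nearest the requested one."""
    target = width / height

    def distance(key: str) -> float:
        w, h = ASPECT_RATIOS[key]
        return abs(w / h - target)

    return min(ASPECT_RATIOS, key=distance)


choice = closest_aspect_ratio(1920, 1080)
width, height = ASPECT_RATIOS[choice]
print(choice, width, height)
```

The resulting `width` and `height` can then be passed to the pipeline call in place of the hard-coded lookup.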
+ # SHOWCASES FROM THE QWEN TEAM:
+ ![](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s1.jpg#center)
+ ![](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s3.jpg#center)
+ ![](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/s2.jpg#center)
+
+ # MORE INFO:
+ - Check out the [Technical Report](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf) for QWEN-IMAGE, released by the Qwen team! <br>
+ - Find the source base model weights on [Hugging Face](https://huggingface.co/Qwen/Qwen-Image) and on [ModelScope](https://modelscope.cn/models/Qwen/Qwen-Image).
+
+ ## QWEN LINKS:
+ <p align="center">
+ 💜 <a href="https://chat.qwen.ai/"><b>Qwen Chat</b></a>&nbsp&nbsp | &nbsp&nbsp🤗 <a href="https://huggingface.co/Qwen/Qwen-Image">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/models/Qwen/Qwen-Image">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf">Tech Report</a> &nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://qwenlm.github.io/blog/qwen-image/">Blog</a> &nbsp&nbsp
+ <br>
+ 🖥️ <a href="https://huggingface.co/spaces/Qwen/qwen-image">Demo</a>&nbsp&nbsp | &nbsp&nbsp💬 <a href="https://github.com/QwenLM/Qwen-Image/blob/main/assets/wechat.png">WeChat (微信)</a>&nbsp&nbsp | &nbsp&nbsp🫨 <a href="https://discord.gg/CV4E9rpNSD">Discord</a>&nbsp&nbsp
+ </p>

+ ## QWEN-IMAGE TECHNICAL REPORT CITATION:
+ ```bibtex
+ @article{qwen-image,
+     title={Qwen-Image Technical Report},
+     author={Qwen Team},
+     journal={arXiv preprint},
+     year={2025}
+ }
+ ```