---
library_name: diffusers
base_model: Qwen/Qwen-Image
base_model_relation: quantized
quantized_by: AlekseyCalvin
license: apache-2.0
language:
- en
- zh
pipeline_tag: text-to-image
tags:
- nf4
- Abliterated
- Qwen2.5-VL7b-Abliterated
- instruct
- Diffusers
- Transformers
- uncensored
- text-to-image
- image-to-image
- image-generation
---
<p align="center">
<img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/qwen_image_logo.png" width="200"/>
</p>
# QWEN-IMAGE Model |nf4|+Abliterated Qwen2.5VL-7b
This repo contains a variant of QWEN's **[QWEN-IMAGE](https://huggingface.co/Qwen/Qwen-Image)**, a state-of-the-art generative model with extensive text-to-image, image-to-image, and instruction/control-based editing capabilities. <br>
To make these cutting-edge capabilities more accessible to those constrained to low-end consumer-grade hardware, **we've quantized the DiT (Diffusion Transformer) component of Qwen-Image to the 4-bit NF4 format** using the `bitsandbytes` toolkit.<br>
We derived this quantization directly from the BF16 base model weights released on 08/04/2025, with no other mix-ins or modifications to the DiT component. <br>
*NOTE: Install `bitsandbytes` prior to inference.* <br>
**QWEN-IMAGE** is an open-weights, customization-friendly frontier model released under the highly permissive Apache 2.0 license, welcoming unrestricted (within legal limits) commercial, experimental, artistic, academic, and other uses and/or modifications. <br>
To help highlight the horizons of possibility broadened by the **QWEN-IMAGE** release, our quantization is bundled with an "Abliterated" (i.e. de-censored) finetune of [Qwen2.5-VL 7B Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct), which serves as QWEN-IMAGE's sole conditioning encoder (for prompts, instructions, input images, controls, etc.) and is a powerful Vision-Language Model in its own right. <br>
Accordingly, our repo pairs a lean NF4 DiT with the **[Qwen2.5-VL-7B-Abliterated-Caption-it](https://huggingface.co/prithivMLmods/Qwen2.5-VL-7B-Abliterated-Caption-it/tree/main)** finetune by [Prithiv Sakthi](https://huggingface.co/prithivMLmods) (aka [prithivMLmods](https://github.com/prithivsakthiur)).
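For reference, here is a minimal sketch of how such an NF4 DiT plus abliterated-encoder combination could be assembled on the fly via Diffusers' `bitsandbytes` integration, assuming a diffusers build recent enough to expose `QwenImageTransformer2DModel`. This is an illustration of the approach, not our exact conversion script. <br>
```python
import torch
from diffusers import BitsAndBytesConfig, DiffusionPipeline, QwenImageTransformer2DModel
from transformers import Qwen2_5_VLForConditionalGeneration

# NF4 quantization settings for the DiT (Diffusion Transformer) component.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Quantize the DiT on the fly from the BF16 base weights.
transformer = QwenImageTransformer2DModel.from_pretrained(
    "Qwen/Qwen-Image",
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)

# Swap in the "Abliterated" Qwen2.5-VL-7B finetune as the conditioning encoder.
text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen2.5-VL-7B-Abliterated-Caption-it",
    torch_dtype=torch.bfloat16,
)

# Assemble the rest of the pipeline around the two replaced components.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",
    transformer=transformer,
    text_encoder=text_encoder,
    torch_dtype=torch.bfloat16,
)
```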
<p align="center">
<img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/merge3.jpg" width="1600"/>
</p>
# NOTICE:
*Do not be alarmed by the file warning from the ClamAV automated checker.* <br>
*It is a clear false positive.* *When scanning one of the standard Diffusers-format Safetensors shards (model weights), the checker reports:*
``The following viruses have been found: Pickle.Malware.SysAccess.sys.STACK_GLOBAL.UNOFFICIAL`` <br>
*However, the Safetensors format by design cannot contain such pickled code: a `.safetensors` file stores only a JSON header plus raw tensor data. You may confirm this yourself through HF's built-in weight/index viewer. <br>
To be clear, this repo does **not** contain any pickle checkpoints or any other pickled data.* <br>
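For additional reassurance, the header of any downloaded shard can be inspected locally with the `safetensors` library; a minimal sketch (the shard filename below is illustrative): <br>
```python
from safetensors import safe_open

# A safetensors file holds only a JSON header (tensor names, dtypes, shapes,
# offsets) plus raw tensor bytes, so there is nothing executable to scan for.
shard_path = "diffusion_pytorch_model-00001-of-00002.safetensors"  # illustrative filename

with safe_open(shard_path, framework="pt") as f:
    for name in f.keys():
        tensor = f.get_slice(name)
        print(name, tensor.get_dtype(), tensor.get_shape())
```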
# TEXT-TO-IMAGE PIPELINE EXAMPLE:
This repo is formatted for use with the Diffusers (0.35.0.dev0+) and Transformers libraries, via their associated pipeline and model component classes, such as the defaults listed in `model_index.json` (in this repo's root folder). <br>
*Sourced/adapted from [the original base model repo](https://huggingface.co/Qwen/Qwen-Image) by QWEN.*
**EDIT: We've run into some issues when using the pipeline below. We will update it once reliable adjustments are confirmed.** <br>
```python
from diffusers import DiffusionPipeline
import torch
import bitsandbytes  # required for the NF4-quantized transformer weights

model_name = "AlekseyCalvin/QwenImage_nf4"

# Load the pipeline on GPU (bfloat16) if available, otherwise on CPU (float32)
if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
    device = "cuda"
else:
    torch_dtype = torch.float32
    device = "cpu"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
pipe = pipe.to(device)

# Quality-boosting suffixes appended to the prompt
positive_magic = {
    "en": " Ultra HD, 4K, cinematic composition.",  # for English prompts
    "zh": " 超清，4K，电影级构图",  # for Chinese prompts
}

# Generate image
prompt = '''A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". Ultra HD, 4K, cinematic composition'''
negative_prompt = " "

# Supported aspect ratios and their (width, height) resolutions
aspect_ratios = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1140),
    "3:4": (1140, 1472),
}
width, height = aspect_ratios["16:9"]

image = pipe(
    prompt=prompt + positive_magic["en"],
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_inference_steps=50,
    true_cfg_scale=4.0,
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]

image.save("example.png")
```
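If the full pipeline does not fit into VRAM on a low-end GPU, Diffusers' standard offloading helper can be used in place of the `pipe.to(device)` call above (requires the `accelerate` package):
```python
# Optional: stream components to the GPU only when each is needed,
# trading some speed for a much smaller peak VRAM footprint.
pipe.enable_model_cpu_offload()
```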
<br>
# SHOWCASES FROM THE QWEN TEAM:



# MORE INFO:
- Check out the [Technical Report](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf) for QWEN-IMAGE, released by the Qwen team! <br>
- Find the source base model weights on [Hugging Face](https://huggingface.co/Qwen/Qwen-Image) and on [ModelScope](https://modelscope.cn/models/Qwen/Qwen-Image).
## QWEN LINKS:
<p align="center">
💜 <a href="https://chat.qwen.ai/"><b>Qwen Chat</b></a>   |   🤗 <a href="https://huggingface.co/Qwen/Qwen-Image">Hugging Face</a>   |   🤖 <a href="https://modelscope.cn/models/Qwen/Qwen-Image">ModelScope</a>   |    📑 <a href="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf">Tech Report</a>    |    📑 <a href="https://qwenlm.github.io/blog/qwen-image/">Blog</a>   
<br>
🖥️ <a href="https://huggingface.co/spaces/Qwen/qwen-image">Demo</a>   |   💬 <a href="https://github.com/QwenLM/Qwen-Image/blob/main/assets/wechat.png">WeChat (微信)</a>   |   🫨 <a href="https://discord.gg/CV4E9rpNSD">Discord</a>  
</p>
## QWEN-IMAGE TECHNICAL REPORT CITATION:
```bibtex
@article{qwen-image,
    title={Qwen-Image Technical Report},
    author={Qwen Team},
    journal={arXiv preprint},
    year={2025}
}
```