mengqin1 committed · Commit 0b3aad7 (verified) · 1 Parent(s): 66eff4b

Fix some typos

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -17,7 +17,7 @@ This is a quantized version of the original [RediDream NSFW I1 model](https://ci
 - `nf4` (NormalFloat4) quantization
 - `fp4` (Float4) quantization
 
-Both are saved in the `safetensors` format and have smaller file sizes than the official GGUF-quantized NF4 version (9.6 GB vs 10.7 GB).
+Both are saved in the `safetensors` format and have smaller file sizes than the official GGUF-quantized NF4 version (in fact a Q4_K_M quantized model, not a true NF4): 9.6 GB vs 10.7 GB.
 
 ## Quantization Method
 
@@ -59,13 +59,13 @@ To load the 4‑bit quantized HiDream I1 model in ComfyUI using the custom loade
 ## Quick Q&A
 
 **1. What’s the difference from the official NF4?**
-There’s no fundamental difference—the official model is in GGUF format, while ours is in safetensors format. Our files are slightly smaller (9.6 GB vs. 10.7 GB).
+The official RediDream NF4 GGUF release is actually a Q4_K_M GGUF quantization, not a real bnb NF4. The official version is therefore slightly larger than ours (10.7 GB vs 9.6 GB) and produces better results: our NF4 model needs about 25+ steps to reach the official version’s quality, and each step is roughly 30% slower on average. On the other hand, our VRAM usage is slightly lower (11.6 GB vs 13.3 GB). Overall, our model’s quality is about the same as a Q4_0 quantization.
 
 **2. How’s the performance?**
-The quantized model requires more conservative settings (e.g. skipping layers 2–6, 15–25 steps, CFG 1.0) to produce high-quality images, so generation may be a bit slower.
+The quantized model requires more conservative settings (e.g. skipping layers 2–6, 15–25 steps, CFG 1.0) to produce high-quality images, so generation may be a bit slower (1.5 s vs 1.1 s per step on an RTX 4080).
 
-**3. How does it compare with GGUF’s Q4_K?**
-bnb‑4bit performs similarly to Q4_K, sometimes slightly weaker, but with a smaller footprint. The NF4 variant can even run faster on certain hardware (with proper software support). Additionally, ComfyUI’s GGUF loader appears to have memory-leak issues, whereas FP4/NF4 loading has not exhibited this problem.
+**3. How does it compare with GGUF’s Q4_0?**
+bnb‑4bit performs similarly to Q4_0, sometimes slightly weaker, but with a smaller footprint. The NF4 variant can even run faster on certain hardware (with proper software support). Additionally, ComfyUI’s GGUF loader appears to have memory-leak issues, whereas FP4/NF4 loading has not exhibited this problem.
 
 ## Postscript
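
For context, the nf4/fp4 checkpoints discussed in this diff follow the standard bitsandbytes 4-bit recipe. Below is a minimal sketch of how such a safetensors checkpoint could be produced with `diffusers`; the model class, source repo id, and output path are illustrative assumptions, not taken from this repo:

```python
# Minimal sketch (assumptions: diffusers with HiDream support and
# bitsandbytes installed; repo id and output dir are hypothetical).
import torch
from diffusers import BitsAndBytesConfig, HiDreamImageTransformer2DModel

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # use "fp4" for the Float4 variant
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the full-precision transformer and quantize its weights on the fly.
transformer = HiDreamImageTransformer2DModel.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",          # assumed source checkpoint
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

# save_pretrained writes the 4-bit weights plus the bnb quantization state
# (absmax, quant maps) into a safetensors checkpoint.
transformer.save_pretrained("hidream-i1-bnb-nf4", safe_serialization=True)
```

Loading the saved folder back with `from_pretrained("hidream-i1-bnb-nf4")` restores the 4-bit weights directly, without re-quantizing.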