Update README.md
README.md
CHANGED
|
@@ -32,11 +32,12 @@ library_name: diffusers
|
|
| 32 |
<div style="display: flex; justify-content: center; align-items: center; text-align: center;">
|
| 33 |
<a href="https://arxiv.org/abs/2411.05007">[Paper]</a> 
|
| 34 |
<a href='https://github.com/mit-han-lab/nunchaku'>[Code]</a> 
|
|
|
|
| 35 |
<a href='https://hanlab.mit.edu/projects/svdquant'>[Website]</a> 
|
| 36 |
<a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
|
| 37 |
</div>
|
| 38 |
|
| 39 |
-

|
|
| 78 |
```
|
| 79 |
|
| 80 |
### Comfy UI
|
| 81 |
-
|
| 82 |
-

|
| 83 |
-
Please check [comfyui/README.md](comfyui/README.md) for the usage.
|
| 84 |
|
| 85 |
## Limitations
|
| 86 |
|
|
|
|
| 32 |
<div style="display: flex; justify-content: center; align-items: center; text-align: center;">
|
| 33 |
<a href="https://arxiv.org/abs/2411.05007">[Paper]</a> 
|
| 34 |
<a href='https://github.com/mit-han-lab/nunchaku'>[Code]</a> 
|
| 35 |
+
<a href='https://svdquant.mit.edu'>[Demo]</a> 
|
| 36 |
<a href='https://hanlab.mit.edu/projects/svdquant'>[Website]</a> 
|
| 37 |
<a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
|
| 38 |
</div>
|
| 39 |
|
| 40 |
+

|
| 41 |
SVDQuant is a post-training quantization technique for 4-bit weights and activations that maintains visual fidelity well. On the 12B FLUX.1-dev model, it achieves a 3.6× memory reduction compared to the BF16 model. By eliminating CPU offloading, it offers an 8.7× speedup over the 16-bit model on a 16GB laptop 4090 GPU, which is 3× faster than the NF4 W4A16 baseline. On PixArt-Σ, it demonstrates significantly better visual quality than other W4A4 or even W4A8 baselines. "E2E" means the end-to-end latency, including the text encoder and VAE decoder.
|
| 42 |
|
| 43 |
## Method
|
|
|
|
| 79 |
```
|
| 80 |
|
| 81 |
### Comfy UI
|
| 82 |
+
roW
|
|
|
|
|
|
|
| 83 |
|
| 84 |
## Limitations
|
| 85 |
|