Request for GGUF quantization of 'Wan2.1-Fun-V1.1-14B-Control-Camera' model
Could you also perform GGUF quantization for this model? ComfyUI already provides native node support, but there are no GGUF models available for it yet. I would greatly appreciate it!
https://huggingface.co/alibaba-pai/Wan2.1-Fun-V1.1-14B-Control-Camera
I'll try it once I've ironed out the bugs in my custom node (;
I got it to apply the VACE patch at runtime, at least to safetensors models in native ComfyUI. I'll do more testing though; maybe this can be applied here too!
Okay, I'll try to tackle this quant today. I'll upload the 1.3B version first so you can test it; if it works, tell me and I'll upload the full version soon!
Will take a few hours though
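For reference, the usual flow for these diffusion-model GGUFs is roughly: convert the safetensors UNet to a full-precision GGUF, then quantize it. A minimal sketch, assuming the ComfyUI-GGUF `tools/convert.py` script and a llama-quantize build patched for image/video models; file names and exact flags here are assumptions, not the exact commands used for these uploads:

```python
# Minimal sketch of the two-step conversion flow (paths, file names and the patched
# llama.cpp build are assumptions -- not the exact commands used for these uploads).
import subprocess

src = "Wan2.1-Fun-V1.1-1.3B-Control-Camera.safetensors"  # hypothetical local checkpoint
f16 = "Wan2.1-Fun-V1.1-1.3B-Control-Camera-F16.gguf"     # assumed name of the intermediate GGUF

# Step 1: safetensors diffusion model -> full-precision GGUF (ComfyUI-GGUF tools/convert.py).
subprocess.run(["python", "tools/convert.py", "--src", src], check=True)

# Step 2: quantize with llama-quantize from a llama.cpp build patched for these architectures.
for qtype in ["Q8_0", "Q5_K_M", "Q4_K_M"]:
    subprocess.run(["llama-quantize", f16, f16.replace("F16", qtype), qtype], check=True)
```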
Nvm, it seems to have been successful. Here is the link to the 1.3B version: https://huggingface.co/QuantStack/Wan2.1-Fun-V1.1-1.3B-Control-Camera-GGUF
If it works, mind giving me an example workflow too?
Should be online soon
workflow: https://files.catbox.moe/kx8fwa.json
The inference speed of the 1.3B Q8 model is very slow, taking approximately 130s/it on a 4070 Ti. It is unclear what is causing this issue.
The generation of this video took 40 minutes.
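For what it's worth, 130 s/it is consistent with a ~40-minute run if the sampler does around 20 steps; a quick sanity check (the step count is an assumption, it isn't stated above):

```python
# Sanity check: does ~130 s/it explain a ~40 minute generation?
sec_per_it = 130   # reported for the 1.3B Q8 on a 4070 Ti
steps = 20         # assumed step count, not confirmed in the thread
print(f"~{sec_per_it * steps / 60:.0f} min total")  # ~43 min, close to the 40 min reported
```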
This is weird, but the generation itself and the model seem to work. I'll convert the main models too then (;
Which exact quant do you want? I'm doing other uploads too, so I can fast-track that one.
I'm currently using GGUF-quantized models (such as FLF2V, VACE, and I2V), all of which are Q5_K_M. Since I only have 12GB of VRAM, I pick GGUF models that are around 12GB or smaller.
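As a rough rule of thumb when picking a quant for a given VRAM budget, the file size is about parameter count × bits-per-weight / 8. A small sketch below; the bits-per-weight figures are the usual llama.cpp approximations, and real GGUFs mix tensor types, so actual sizes differ somewhat:

```python
# Rough GGUF file-size estimate for a 14B model (approximate bits/weight, actual files vary).
params = 14e9
bits_per_weight = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

for name, bpw in bits_per_weight.items():
    print(f"{name}: ~{params * bpw / 8 / 1e9:.1f} GB")
# Q5_K_M comes out around ~10 GB, which is why it is a comfortable fit for a 12 GB card,
# leaving some headroom for activations and offloaded text-encoder/VAE weights.
```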
A video with 81 frames took over 40 minutes to generate.
Is this due to a lack of optimization for the Control Camera model in the official ComfyUI implementation? If the generation speed can't be improved, I think the Control Camera model loses much of its practical value.
I'll try to upload the Q5 today then; maybe it's 1.3B-specific.
Should be online in ~40 mins: https://huggingface.co/QuantStack/Wan2.1-Fun-V1.1-14B-Control-Camera
Unfortunately, the 14B model's inference speed is slower than the 1.3B model's.
I believe this model is not usable on regular consumer-grade GPUs.
I suggest you consider abandoning the quantization plan for this model.
I'll be doing some testing of my own later; if I hit the same issue, I'll delete it /:
Mind sharing your workflow? I can't find that camera control node anywhere /:
https://github.com/comfyanonymous/ComfyUI/commit/c820ef950d10a6b4e4fa8ab28bc09274d563b13c
Here it is: https://files.catbox.moe/kx8fwa.json
I'm currently testing the 14B version and so far the speed seems normal? I use CausVid and SageAttention, and with those I can generate this video with your settings in around 2-3 minutes. I have a different start picture, but the output looks like this:
Quality seems bad because 1. I'm only using Q5 and 2. I haven't dialed in CausVid correctly yet, but as a proof of concept it seems to work fine (;
I'm gonna upload those soon then!
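In case it helps with reproducing the fast runs: CausVid is a distillation LoRA, so it is meant to be run with very few steps and with CFG effectively disabled. The values below are only commonly reported starting points and are assumptions on my side, not settings confirmed in this thread:

```python
# Commonly reported starting points for the CausVid LoRA (assumptions -- tune to taste).
causvid_settings = {
    "steps": 6,                    # low step count is the whole point of the distillation LoRA
    "cfg": 1.0,                    # CFG essentially disabled; higher values tend to degrade output
    "lora_strength": 0.7,          # reports range from roughly 0.3 to 1.0 depending on the base model
    "attention": "sageattention",  # optional speed-up, used in the 2-3 minute runs above
}
print(causvid_settings)
```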
Maybe it's because your step count is set to 6; setting it to 20 steps would be much slower.
My environment is Python 3.12, Torch 2.7.0, CUDA 12.8, and Xformers 0.0.30.
Yeah, that's because I'm using CausVid, though normal VACE also takes about the same amount of time with the same settings.
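Rough numbers on why the step count dominates the runtime here (the 2-3 minute figure is from above; the rest is just proportional scaling):

```python
# Back-of-the-envelope: 6 steps in 2-3 minutes implies ~20-30 s/it,
# so the same setup at 20 steps would land somewhere around 7-10 minutes.
for total_min in (2, 3):
    sec_per_it = total_min * 60 / 6
    print(f"{total_min} min @ 6 steps -> ~{sec_per_it:.0f} s/it -> ~{sec_per_it * 20 / 60:.1f} min @ 20 steps")
```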
It seems that ComfyUI now officially provides native support for the Phantom model, so GGUF quantization might be worth trying.
The original model size is 60GB, and to be honest, I'm not sure whether GGUF quantization would allow it to run on a 12GB VRAM setup.
https://github.com/comfyanonymous/ComfyUI/commit/5e5e46d40c94a4efb7e0921d88493c798c021d82
The 60GB is in FP32; FP16 would already be 30GB and Q8 around 15GB, so pretty normal for a Wan model. I'm already quantizing, but the uploads could take a while /:
But Q5 or so should easily work on a 12GB VRAM card; you can even run Q8 if you really want to.
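The same size math in plain numbers (bytes per weight are approximate, and the parameter count is just inferred from the 60GB FP32 checkpoint):

```python
# FP32 -> FP16 -> Q8 -> Q5 sizes for this model (approximate bytes per weight).
params = 60e9 / 4   # ~15B parameters implied by the 60 GB FP32 checkpoint
for name, bytes_per_weight in [("FP16", 2.0), ("Q8_0", 1.06), ("Q5_K_M", 0.71)]:
    print(f"{name}: ~{params * bytes_per_weight / 1e9:.0f} GB")
# FP16 ~30 GB, Q8 ~16 GB, Q5_K_M ~11 GB -- in line with the 30/15 GB figures above,
# and the Q5 file loads on a 12 GB card with the usual text-encoder/VAE offloading.
```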