Request for GGUF quantization of 'Wan2.1-Fun-V1.1-14B-Control-Camera' model
Could you also perform GGUF quantization for this model? ComfyUI already provides native node support, but there are no GGUF models available for it yet. I would greatly appreciate it!
https://huggingface.co/alibaba-pai/Wan2.1-Fun-V1.1-14B-Control-Camera
I'll try it once I've ironed out the bugs in my custom node (;
I got it to apply the VACE patch at runtime, at least to safetensors models in native ComfyUI. I'll do more testing though; maybe this can be applied here too!
Okay, I'll try to tackle this quant today. I'll upload the 1.3B version first so you can test it; if it works, tell me and I'll upload the full version soon!
Will take a few hours though
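For reference, the usual flow for these diffusion-model GGUFs is roughly: convert the safetensors UNet to a full-precision GGUF, then quantize it. A minimal sketch, assuming the ComfyUI-GGUF `tools/convert.py` script and a llama-quantize build patched for image/video models; file names and exact flags here are assumptions, not the exact commands used for these uploads:

```python
# Minimal sketch of the two-step conversion flow (paths, file names and the patched
# llama.cpp build are assumptions -- not the exact commands used for these uploads).
import subprocess

src = "Wan2.1-Fun-V1.1-1.3B-Control-Camera.safetensors"  # hypothetical local checkpoint
f16 = "Wan2.1-Fun-V1.1-1.3B-Control-Camera-F16.gguf"     # assumed name of the intermediate GGUF

# Step 1: safetensors diffusion model -> full-precision GGUF (ComfyUI-GGUF tools/convert.py).
subprocess.run(["python", "tools/convert.py", "--src", src], check=True)

# Step 2: quantize with llama-quantize from a llama.cpp build patched for these architectures.
for qtype in ["Q8_0", "Q5_K_M", "Q4_K_M"]:
    subprocess.run(["llama-quantize", f16, f16.replace("F16", qtype), qtype], check=True)
```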
Nvm, it seems to have been successful. Here is the link to the 1.3B version: https://huggingface.co/QuantStack/Wan2.1-Fun-V1.1-1.3B-Control-Camera-GGUF
If it works, mind giving me an example workflow too?
Should be online soon
workflow: https://files.catbox.moe/kx8fwa.json
The inference speed of the 1.3B Q8 model is very slow, taking approximately 130s/it on a 4070 Ti. It is unclear what is causing this issue.
The generation of this video took 40 minutes.
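For what it's worth, 130 s/it is consistent with a ~40-minute run if the sampler does around 20 steps; a quick sanity check (the step count is an assumption, it isn't stated above):

```python
# Sanity check: does ~130 s/it explain a ~40 minute generation?
sec_per_it = 130   # reported for the 1.3B Q8 on a 4070 Ti
steps = 20         # assumed step count, not confirmed in the thread
print(f"~{sec_per_it * steps / 60:.0f} min total")  # ~43 min, close to the 40 min reported
```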
This is weird, but the generation itself and the model seem to work. I'll convert the main models too then (;
Which exact quant do you want? I'm doing other uploads too, so I can fast-track that one.
I'm currently using GGUF-quantized models (such as FLF2V, VACE, and I2V), all of which are Q5_K_M. Since I only have 12GB of VRAM, I pick GGUF models that are around 12GB or smaller.
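As a rough rule of thumb when picking a quant for a given VRAM budget, the file size is about parameter count × bits-per-weight / 8. A small sketch below; the bits-per-weight figures are the usual llama.cpp approximations, and real GGUFs mix tensor types, so actual sizes differ somewhat:

```python
# Rough GGUF file-size estimate for a 14B model (approximate bits/weight, actual files vary).
params = 14e9
bits_per_weight = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

for name, bpw in bits_per_weight.items():
    print(f"{name}: ~{params * bpw / 8 / 1e9:.1f} GB")
# Q5_K_M comes out around ~10 GB, which is why it is a comfortable fit for a 12 GB card,
# leaving some headroom for activations and offloaded text-encoder/VAE weights.
```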
A video with 81 frames took over 40 minutes to generate.
Is this due to a lack of optimization for the Control Camera model in the official ComfyUI implementation? If the generation speed can't be improved, I think the Control Camera model loses much of its practical value.
I'll try to upload the Q5 today then; maybe it's 1.3B-specific.
Should be online in ~40 mins: https://huggingface.co/QuantStack/Wan2.1-Fun-V1.1-14B-Control-Camera
Unfortunately, the 14B model's inference speed is slower than the 1.3B model's.
I believe this model is not usable on regular consumer-grade GPUs.
I suggest you consider abandoning the quantization plan for this model.
I'll be doing some testing of my own later; if I hit the same issue, I'll delete it /:
Mind sharing your workflow? I can't find that camera control node anywhere /:
https://github.com/comfyanonymous/ComfyUI/commit/c820ef950d10a6b4e4fa8ab28bc09274d563b13c
Here it is: https://files.catbox.moe/kx8fwa.json
I'm currently testing the 14B version and so far the speed seems normal? I use CausVid and SageAttention, and with those I can generate this video with your settings in around 2-3 minutes. I have a different start picture, but the output looks like this:
Quality seems bad because 1. I'm only using Q5 and 2. I haven't dialed in CausVid correctly yet, but as a proof of concept it seems to work fine (;
I'm gonna upload those soon then!
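In case it helps with reproducing the fast runs: CausVid is a distillation LoRA, so it is meant to be run with very few steps and with CFG effectively disabled. The values below are only commonly reported starting points and are assumptions on my side, not settings confirmed in this thread:

```python
# Commonly reported starting points for the CausVid LoRA (assumptions -- tune to taste).
causvid_settings = {
    "steps": 6,                    # low step count is the whole point of the distillation LoRA
    "cfg": 1.0,                    # CFG essentially disabled; higher values tend to degrade output
    "lora_strength": 0.7,          # reports range from roughly 0.3 to 1.0 depending on the base model
    "attention": "sageattention",  # optional speed-up, used in the 2-3 minute runs above
}
print(causvid_settings)
```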
Maybe it's because your step count is set to 6; setting it to 20 steps would be much slower.
My environment is Python 3.12, Torch 2.7.0, CUDA 12.8, and Xformers 0.0.30.
Yeah, that's because I'm using CausVid, though normal VACE also takes about the same amount of time with the same settings.
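Rough numbers on why the step count dominates the runtime here (the 2-3 minute figure is from above; the rest is just proportional scaling):

```python
# Back-of-the-envelope: 6 steps in 2-3 minutes implies ~20-30 s/it,
# so the same setup at 20 steps would land somewhere around 7-10 minutes.
for total_min in (2, 3):
    sec_per_it = total_min * 60 / 6
    print(f"{total_min} min @ 6 steps -> ~{sec_per_it:.0f} s/it -> ~{sec_per_it * 20 / 60:.1f} min @ 20 steps")
```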
It seems that ComfyUI now officially provides native support for the Phantom model, so GGUF quantization might be worth trying.
The original model size is 60GB, and to be honest, I'm not sure whether GGUF quantization would allow it to run on a 12GB VRAM setup.
https://github.com/comfyanonymous/ComfyUI/commit/5e5e46d40c94a4efb7e0921d88493c798c021d82
The 60GB is in FP32; FP16 would already be 30GB and Q8 around 15GB, so pretty normal for a Wan model. I'm already quantizing, but the uploads could take a while /:
But Q5 or so should easily work on a 12GB VRAM card; you can even run Q8 if you really want to.
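The same size math in plain numbers (bytes per weight are approximate, and the parameter count is just inferred from the 60GB FP32 checkpoint):

```python
# FP32 -> FP16 -> Q8 -> Q5 sizes for this model (approximate bytes per weight).
params = 60e9 / 4   # ~15B parameters implied by the 60 GB FP32 checkpoint
for name, bytes_per_weight in [("FP16", 2.0), ("Q8_0", 1.06), ("Q5_K_M", 0.71)]:
    print(f"{name}: ~{params * bytes_per_weight / 1e9:.0f} GB")
# FP16 ~30 GB, Q8 ~16 GB, Q5_K_M ~11 GB -- in line with the 30/15 GB figures above,
# and the Q5 file loads on a 12 GB card with the usual text-encoder/VAE offloading.
```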