Wan 2.1 14B I2V 720p Q6_K quantized GGUF
Hey guys, I saw a YouTube tutorial that claimed to run the Wan 2.1 14B I2V 720p Q6_K quantized GGUF model successfully on 8GB of VRAM, so I thought I would try it.
I downloaded the city96 gguf model here: https://huggingface.co/city96/Wan2.1-I2V-14B-720P-gguf?show_file_info=wan2.1-i2v-14b-720p-Q6_K.gguf
i used this workflow: https://tensor.art/workflows/83641138...
Well, it worked (which shocked me), but the output was very bad: super glitchy, with weird stuff happening.
I tried all sorts of different configurations: steps 25-50, cfg 4-10, denoise 0.4-1.0, euler/uni_pc/dpmpp_2m samplers, and simple/normal schedulers. I was able to get results without crashing my computer, but the outputs were just so bad that I couldn't get even one decent result after a lot of experimentation. Any tips for getting better outputs? Should I try the calcuis GGUF models instead of city96? A lower quantization? 480p instead of 720p?
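Roughly, the combinations I swept look like this (just a sketch; the specific values are illustrative points within the ranges above):

```python
from itertools import product

# Illustrative sweep over the settings mentioned above; the names mirror the
# ComfyUI KSampler inputs, and the values are sample points from the ranges I tried.
steps      = [25, 35, 50]
cfg        = [4, 6, 8, 10]
denoise    = [0.4, 0.7, 1.0]
samplers   = ["euler", "uni_pc", "dpmpp_2m"]
schedulers = ["simple", "normal"]

for s, c, d, samp, sched in product(steps, cfg, denoise, samplers, schedulers):
    print(f"steps={s} cfg={c} denoise={d} sampler={samp} scheduler={sched}")
```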
Thanks in advance guys!
Can you post a full screenshot of the workflow? The link you posted above seems to be cut off.
There are a few things that could cause issues like that, but it's hard to troubleshoot without more info.
@city96 are the dimensions of the GGUF files really accurate?
I am getting mismatch errors:
While copying the parameter named "blocks.20.ffn.2.weight", whose dimensions in the model are torch.Size([5120, 13824]) and whose dimensions in the checkpoint are torch.Size([5120, 14688]), an exception occurred : ('only Tensors of floating point dtype can require gradients',).
Official model config:
{
"has_image_input": true,
"patch_size": [1, 2, 2],
"in_dim": 36,
"dim": 5120,
"ffn_dim": 13824,
"freq_dim": 256,
"text_dim": 4096,
"out_dim": 16,
"num_heads": 40,
"num_layers": 40,
"eps": 1e-6
}
@MonsterMMORPG
I assume you are looking at the shape of the quantized data, or attempting to load the quantized data into a regular nn.Linear layer.
The dimensions for the gguf files are indeed correct, which you can verify on the huggingface metadata viewer:
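For example, a quick way to check the same thing locally with the `gguf` Python package (a minimal sketch; the file path is just wherever you saved the Q6_K file):

```python
from gguf import GGUFReader

# Print the logical shape stored in the GGUF metadata next to the shape of the
# raw quantized payload for one tensor. The logical shape is what the model
# expects; the payload is packed into quantization blocks, so its size will not
# match a plain nn.Linear weight.
reader = GGUFReader("wan2.1-i2v-14b-720p-Q6_K.gguf")  # adjust to your local path
for t in reader.tensors:
    if t.name == "blocks.20.ffn.2.weight":
        print("quant type    :", t.tensor_type.name)
        print("logical shape :", t.shape.tolist())            # dims as stored in the file (GGML order)
        print("packed payload:", t.data.shape, t.data.dtype)  # raw quantized bytes, not float weights
```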
OK, in that case that repo's implementation is inaccurate, ty.
I used the implementation from here: https://github.com/modelscope/DiffSynth-Engine/blob/bc3824d027526779c72f04ec9b7bd39f861eac2b/diffsynth_engine/utils/gguf.py#L9
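For reference, a minimal sketch of a dequantize-then-load approach (assuming a recent gguf-py that provides gguf.quants.dequantize; the tensor name and path are just examples, and this is not necessarily how DiffSynth-Engine handles it):

```python
import numpy as np
import torch
from gguf import GGUFReader
from gguf.quants import dequantize  # available in recent gguf-py releases

# Dequantize one tensor back to floats and reshape it to the logical dims
# before handing it to a regular nn.Linear, instead of copying the packed
# quantized bytes directly.
reader = GGUFReader("wan2.1-i2v-14b-720p-Q6_K.gguf")
t = next(x for x in reader.tensors if x.name == "blocks.20.ffn.2.weight")

weight = dequantize(t.data, t.tensor_type)  # packed bytes -> float array
# GGUF stores dims fastest-changing first, so reverse them for the torch layout.
weight = torch.from_numpy(np.ascontiguousarray(weight)).reshape(*reversed(t.shape.tolist()))
print(weight.shape)  # should now match the module's expected torch.Size
```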
I haven't heard about that specific backend before, but it looks like they have added support here: https://github.com/modelscope/DiffSynth-Engine/pull/21
(including test cases that use the models from the T2V version of this repo)
I assume you've updated to the latest version already? You could also try the models defined as the test case(s) in the PR above, as a sanity check:
Hey @city96,
I've managed to get the new LTX 13B and its quants to work. It's just a workaround so far, but I'm uploading the GGUFs already (;
https://huggingface.co/wsbagnsv1/ltxv-13b-0.9.7-dev-GGUF