Questions: hoping to develop a node for ComfyUI
Was the convert.py script from city96/ComfyUI-GGUF used? If so, were any modifications made to it to support the HiDreamImageTransformer2DModel architecture (e.g., adding a ModelHiDream class to convert.py)? Thank you!
Not that one; you could do it with the convertor if you have the GGUF node, but you should merge the safetensors first.
Looks like ComfyUI has made changes to support HiDream and GGUF loading. I am getting the following error on load of the q8 GGUF, and a similar error exists for all of them. Do you know if this is an error with the GGUF file or with the loader?
Error(s) in loading state_dict for HiDreamImageTransformer2DModel:
While copying the parameter named "double_stream_blocks.0.block.ff_i.gate.weight", whose dimensions in the model are torch.Size([4, 2560]) and whose dimensions in the checkpoint are torch.Size([4, 2720]), an exception occurred : ('The size of tensor a (2560) must match the size of tensor b (2720) at non-singleton dimension 1',).
While copying the parameter named "double_stream_blocks.1.block.ff_i.gate.weight", whose dimensions in the model are torch.Size([4, 2560]) and whose dimensions in the checkpoint are torch.Size([4, 2720]), an exception occurred : ('The size of tensor a (2560) must match the size of tensor b (2720) at non-singleton dimension 1',). ...
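Since every mismatch here follows the same pattern (the checkpoint has 2720 where the model expects 2560), it can help to diff all the shapes up front instead of letting `load_state_dict` fail one parameter at a time. A minimal, framework-free sketch (the helper name is my own, not part of ComfyUI):

```python
# Hypothetical helper: diff checkpoint shapes against the model's expected
# shapes before loading, so every mismatch is reported at once.
def diff_shapes(model_shapes, ckpt_shapes):
    """Return {name: (model_shape, ckpt_shape)} for every shape mismatch."""
    mismatches = {}
    for name, model_shape in model_shapes.items():
        ckpt_shape = ckpt_shapes.get(name)
        if ckpt_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(model_shape), tuple(ckpt_shape))
    return mismatches

# Shapes taken from the error above:
model = {"double_stream_blocks.0.block.ff_i.gate.weight": (4, 2560)}
ckpt = {"double_stream_blocks.0.block.ff_i.gate.weight": (4, 2720)}
print(diff_shapes(model, ckpt))
# → {'double_stream_blocks.0.block.ff_i.gate.weight': ((4, 2560), (4, 2720))}
```

With real models you would build the two dicts from `model.state_dict()` and the loaded checkpoint, e.g. `{k: v.shape for k, v in sd.items()}`.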
How about the fp8 safetensors? The first three (q4_0, q5_0 and q8_0) might need a bit of adjustment; the others should work. Try the GGUF QuadrupleCLIP Loader with the code update later; still working on it.
I tried using the Load Diffusion Model node with the fp8 safetensors; it gives me the same error when it tries to do KSampling. I also tried the q4_0, but it gave a very similar error: the dimension numbers were different, but still a mismatch. I saw city96 is uploading GGUFs as well; I am downloading those to see if they work. https://huggingface.co/city96/HiDream-I1-Dev-gguf
ok, thanks; stay tuned
I tried city96's GGUFs; they get past the GGUF loading stage, but it gives this error on KSampling:
mat1 and mat2 shapes cannot be multiplied (2x768 and 2048x2560)
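That `mat1 and mat2` error is PyTorch's matrix-multiply shape rule: `mat1 @ mat2` needs `mat1`'s last dimension to equal `mat2`'s first. Here a `(2, 768)` activation is hitting a layer whose weight expects a 2048-wide input, which would be consistent with the wrong text-encoder output being fed in (768 is a typical CLIP embedding width) — that reading is my inference, not confirmed. A tiny check of the rule:

```python
def matmul_compatible(a_shape, b_shape):
    """mat1 @ mat2 requires mat1's last dim to equal mat2's first dim."""
    return a_shape[-1] == b_shape[0]

# The failing shapes from the KSampler error:
print(matmul_compatible((2, 768), (2048, 2560)))   # inner dims 768 vs 2048 -> False
print(matmul_compatible((2, 2048), (2048, 2560)))  # what the layer expects -> True
```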
Hoping for a fix soon.
For the quants in this repo, they likely need to be re-uploaded, as the FFN gate weight gets loaded into a torch.nn.Parameter, which means you have to keep it in either FP32 or FP16 for it to be loadable.
For my quants, I did a quick test and they did work on the instance I was testing on. You do need all 4 text encoders from the Comfy repo for it to work, though. (L8 is just Llama 8B; I made that myself before Comfy finished uploading. The CLIP models in that repo are, I think, also different, as they include the projection weight while the ones Comfy uploaded with Flux didn't, so you may need to download those as well.)
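The fix described above amounts to a per-tensor type rule in the conversion script: anything that lands in a plain `torch.nn.Parameter` (like the `ff_i.gate` weights) stays FP16/FP32, while large 2-D matrices get quantized. A hedged sketch of that kind of rule — the key pattern and size threshold are illustrative assumptions, not the actual convert.py logic:

```python
# Assumed pattern for tensors that must stay high precision; the real
# conversion script may match different keys.
KEEP_HIGH_PRECISION_PATTERNS = (".gate.weight",)

def choose_dtype(name, shape, quant_type="Q8_0"):
    """Pick a per-tensor type: quantize big matrices, keep the rest in F16."""
    if any(name.endswith(p) for p in KEEP_HIGH_PRECISION_PATTERNS):
        return "F16"  # nn.Parameter targets must stay FP16/FP32 to load
    if len(shape) < 2 or min(shape) < 256:
        return "F16"  # 1-D / tiny tensors aren't worth quantizing
    return quant_type

print(choose_dtype("double_stream_blocks.0.block.ff_i.gate.weight", (4, 2560)))  # F16
print(choose_dtype("double_stream_blocks.0.block.ff.w1.weight", (2560, 2560)))   # Q8_0
```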
thanks
Which one is recommended after testing, Full or Dev? I have 24 GB of VRAM.
@zhaoqi The difference between Full and Dev is speed vs. quality: Full recommends 50 steps while Dev recommends 28 steps, and they consume the same amount of VRAM. I think Full is better than Dev, but I have only been working with the NF4 models. I would try the q8 model, and if it's too slow, step down to q6 or q4.
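To pick a quant for a given card, you can estimate the transformer's weight footprint from the effective bits-per-weight of each GGUF type. The ~17B parameter count is an assumption (check the HiDream-I1 model card), and the bpw figures are the usual llama.cpp values; actual VRAM use will be higher once text encoders, VAE, and activations are added:

```python
# Typical effective bits per weight for common GGUF quant types (llama.cpp).
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q6_K": 6.56, "Q5_0": 5.5, "Q4_0": 4.5}

def est_gib(n_params, quant):
    """Rough size of the quantized weights in GiB."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1024**3

for q in BITS_PER_WEIGHT:
    print(f"{q}: ~{est_gib(17e9, q):.1f} GiB")
```

On this estimate, q8 weights alone come in under 24 GiB, which matches the advice above to start at q8 and only step down if it is too slow or spills out of VRAM.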