Weight dtype "default" is very slow
First of all, I have a 4090 and 64 GB of RAM in my machine.
I am using ComfyUI.
Now I am wondering what the problem is with the weight dtype.
When I set the weight dtype to "default", it takes up to 10 minutes to generate a simple 1024x1024 image.
But when I set the weight dtype to "fp8_e4m3fn", it generates images super fast, in around 14 seconds.
So what did I do wrong? :D
By the way, it is not using a lot of RAM: only 12 GB of system RAM is in use (I don't mean VRAM).
Default is bfloat16, and the weights are ~23 GB, so your 4090's VRAM is still not enough.
However, I can run the default UNet on a 4070 12G at about 2.5 s/it (it uses a lot of RAM).
So I think your case can be faster; maybe some settings are wrong.
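
To put rough numbers on that: bfloat16 stores 2 bytes per weight and fp8_e4m3fn stores 1, so the same model takes about half the space in fp8. Here is a minimal sketch of the arithmetic, assuming the ~23 GB figure covers the weights only. The parameter count is backed out from that figure rather than taken from any model card, and `weights_gb` is just an illustrative helper.

```python
# Rough weight-size arithmetic; the parameter count is inferred, not measured.
BYTES_PER_PARAM = {"bfloat16": 2, "fp8_e4m3fn": 1}


def weights_gb(n_params: float, dtype: str) -> float:
    """Size of the weights alone in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# Back out the parameter count from the ~23 GB bfloat16 figure (assumption).
n_params = 23e9 / BYTES_PER_PARAM["bfloat16"]

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weights_gb(n_params, dtype):.1f} GB")
```

Under those assumptions the fp8 weights come to roughly half the bfloat16 size, which leaves headroom on a 24 GB card for activations and the other models in the workflow. That is likely why fp8 runs fast while the default dtype does not.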
For me it's the other way round: if I use fp8_e4 or fp8_e5 it's really slow, but on default I get 5 s/it with my 3060 12G. I did the following: updated the Nvidia drivers and, in the Nvidia settings, disabled the Sysmem Fallback Policy so it will only use your VRAM and not your system RAM.
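
Since this thread mixes total times and s/it, here is a rough conversion between the two. The step count of 20 is purely an assumption (the original post doesn't say how many steps were used), and the calculation ignores model loading and VAE decode time.

```python
def seconds_per_iteration(total_seconds: float, steps: int) -> float:
    """Average time per sampling step, ignoring load and decode overhead."""
    return total_seconds / steps


ASSUMED_STEPS = 20  # assumption: the step count is not given in the thread

slow = seconds_per_iteration(10 * 60, ASSUMED_STEPS)  # "default", ~10 minutes
fast = seconds_per_iteration(14, ASSUMED_STEPS)       # "fp8_e4m3fn", ~14 seconds

print(f"default:    ~{slow:.1f} s/it")
print(f"fp8_e4m3fn: ~{fast:.1f} s/it")
```

With that assumed step count, 10 minutes works out to about 30 s/it, far slower than the 2.5 to 5 s/it reported on the 12 GB cards above, which suggests something beyond the dtype itself is off.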