8 GB VRAM card

#3
by matrix7878 - opened

What is the appropriate version for an RTX 3060 Ti 8GB card with 32GB RAM? Q3? Q4? Or something else? And another question: have you noticed that the 13B version is faster than Wan2.1?

Yes, it is indeed a lot faster (;

As for 8GB, you should probably use the DisTorch node that is deactivated in the example workflow, then just try out what works: start with Q3 and work your way up (;
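If it helps, here is a rough way to guess which quant will fit. This is just a back-of-the-envelope sketch of my own; the bits-per-weight values are approximate averages for GGUF quant types, not exact figures for these files, and you still need headroom for the text encoder, VAE, and latents:

```python
# Rough rule of thumb: file size ≈ parameter_count * bits_per_weight / 8.
# Bits-per-weight below are approximate averages (assumption, not measured).
APPROX_BITS_PER_WEIGHT = {
    "Q3_K": 3.5,
    "Q4_K": 4.5,
    "Q5_K": 5.5,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}

def approx_size_gb(params_billions: float, quant: str) -> float:
    """Very rough on-disk / in-VRAM size estimate for a GGUF quant."""
    return params_billions * APPROX_BITS_PER_WEIGHT[quant] / 8

for quant in APPROX_BITS_PER_WEIGHT:
    print(f"13B at {quant}: ~{approx_size_gb(13, quant):.1f} GB")
```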

I have updated the workflow for low VRAM cards (or even high-end cards): just enable the DisTorch node and it will work without much speed loss (;
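For anyone curious why offloading costs so little speed here: conceptually it keeps model blocks parked in system RAM and streams each one to the GPU only while it runs. This is only my own illustration of the general idea, not the DisTorch node's actual code:

```python
import torch
import torch.nn as nn

class OffloadedBlock(nn.Module):
    """Illustrative sketch: park a block in system RAM, move it to the GPU
    only for its forward pass, then free the VRAM for the next block."""

    def __init__(self, block: nn.Module, device: str = "cuda"):
        super().__init__()
        self.block = block.to("cpu")   # parked in system RAM
        self.device = device

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.block.to(self.device)     # stream weights in for this step
        out = self.block(x.to(self.device))
        self.block.to("cpu")           # release VRAM again
        return out
```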

Thanks a lot <3

There is an issue right now with the DisTorch node and it's not working as expected, but the first run usually works. You can add a clear-VRAM node after generation, between the sampler and the VAE decode, to prevent OOMs. This means you need to load the models each generation, but at least there are no OOMs. I've already created an issue thread on the GitHub of the node dev (;
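If you're wondering what such a clear-VRAM node actually does between those two steps, it is basically the standard PyTorch cleanup pattern. A minimal sketch (not the actual node's code):

```python
import gc
import torch

def clear_vram():
    """Free cached GPU memory between the sampler and the VAE decode.

    Drop Python references, run the garbage collector, then hand PyTorch's
    cached CUDA allocations back to the driver.
    """
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
```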

At least, that's what is happening for me.

It works fine for me (or at least I think so). My question now: I downloaded ltxv-13b-0.9.7-dev-fp8.safetensors, the 15.7 GB file. Can your workflow work with it? Which nodes do I have to delete and which do I have to add? Also, where do I put this big ltxv-13b-0.9.7-dev-fp8.safetensors file, if you make the same workflow load it (I mean with some edits so that ltxv-13b-0.9.7-dev-fp8.safetensors works with it)? Really big thanks in advance, it's a very good workflow <3 <3

First of all, what GPU do you have? Everything below RTX 4000 does not profit from fp8, so you should use GGUF quants instead, since they have better quality. If you have a 4000-series card or higher, you basically just need to change the loader, add the Q8 patch, and install the kernels. If you need help, write me your Discord.
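If you're not sure where your card falls, a quick check via CUDA compute capability works. A small sketch, assuming Ada / RTX 4000 reports (8, 9) and Ampere / RTX 3000 reports (8, 6):

```python
import torch

def has_native_fp8() -> bool:
    """Return True if the GPU's compute capability suggests native fp8 support.

    Ada Lovelace (RTX 4000 series) reports (8, 9); Ampere cards like the
    RTX 3060 Ti report (8, 6), where GGUF quants are the better choice.
    """
    if not torch.cuda.is_available():
        return False
    return torch.cuda.get_device_capability() >= (8, 9)

print("Native fp8:", has_native_fp8())
```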

Ah, never mind, I just saw you have an RTX 3000 series card, so I would advise against the fp8 model: it's as big as Q8 but not really faster, and worse quality. Also, you can't use DisTorch to load it but need block swap, and I don't know if that's working on the new 13B model already.

Yes, my card is an RTX 3060 Ti... so I can't really use this fp8 directly... but I can use your ltxv-13b-0.9.7-dev-Q8_0.gguf, or at least ltxv-13b-0.9.7-dev-Q6_K.gguf, right? Also, you added a LoRA node, so I hope the ltxv-13b-0.9.7-distilled-lora128.safetensors LoRA works fine.

Also, I really don't know... is it better to download ltxv-13b-0.9.7-dev-Q6 with the LoRA, or to download ltxv-13b-0.9.7-distilled-Q6_K.gguf directly? I mean, I need the better quality too.

Just try both (;
