Best way to offload layers to CPU/RAM with 6x24GB GPUs

#4
by djdeniro - opened

Hello! With 256 GB RAM + 144 GB VRAM, what is the best way to offload layers to the CPU?

For Q2_K_XL and Q4_K_XL.

Thanks!

Unsloth AI org

I wrote about how to offload more layers for better performance here: https://docs.unsloth.ai/basics/qwen3-coder#improving-generation-speed :)
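For anyone who wants a starting point before reading the guide, here is a rough sketch of the general idea for MoE models in llama.cpp: keep attention and the early layers on the GPUs, and send the MoE expert tensors of the later layers to the CPU with `-ot` (`--override-tensor`). The helper names below are made up for illustration, the tensor names assume the usual GGUF pattern `blk.<layer>.ffn_(gate|up|down)_exps`, and the layer count (62) and split point (20) are placeholders, so check your model's metadata and tune the split until VRAM is nearly full.

```python
import re


def expert_offload_pattern(first_cpu_layer: int, total_layers: int) -> str:
    """Build a regex matching MoE expert tensors in layers
    first_cpu_layer .. total_layers - 1 (hypothetical helper).

    Assumes GGUF tensor names such as 'blk.20.ffn_up_exps.weight'.
    """
    if not 0 <= first_cpu_layer < total_layers:
        raise ValueError("first_cpu_layer must be in [0, total_layers)")
    # Explicit alternation of layer numbers, e.g. "20|21|...|61".
    layers = "|".join(str(i) for i in range(first_cpu_layer, total_layers))
    return rf"blk\.({layers})\.ffn_(gate|up|down)_exps\."


def override_tensor_args(first_cpu_layer: int, total_layers: int) -> list[str]:
    """Return the '-ot <regex>=CPU' argument pair for llama.cpp."""
    pattern = expert_offload_pattern(first_cpu_layer, total_layers)
    return ["-ot", f"{pattern}=CPU"]


if __name__ == "__main__":
    # Placeholder values: 62 layers in total, experts of layers 20+ go to CPU.
    args = override_tensor_args(first_cpu_layer=20, total_layers=62)
    print(" ".join(args))

    # Quick sanity check of which tensors the pattern would move to the CPU.
    pat = re.compile(expert_offload_pattern(20, 62))
    for name in ("blk.5.ffn_up_exps.weight",
                 "blk.20.ffn_up_exps.weight",
                 "blk.20.attn_q.weight"):
        target = "CPU" if pat.search(name) else "GPU"
        print(f"{name} -> {target}")
```

Lowering the split point moves more experts to system RAM (slower, less VRAM), raising it keeps more on the GPUs. Q2_K_XL should tolerate a higher split point than Q4_K_XL because its tensors are smaller. The printed arguments are meant to be added to a llama.cpp command that otherwise offloads all layers to the GPUs.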
