Does llama.cpp load only 1/4 of the parameters, or is this syntactic sugar?

#5
by hermitg - opened

The official documentation at https://docs.unsloth.ai/basics/qwen3-coder#llama.cpp-run-qwen3-tutorial shows this command:

./llama.cpp/llama-cli \
    --model unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF/UD-Q2_K_XL/Qwen3-Coder-480B-A35B-Instruct-UD-Q2_K_XL-00001-of-00004.gguf \
...

It seems to load only 1/4 of the model parameters. Is that really the case, or is it syntactic sugar?

Unsloth AI org

Oh, the first file already contains the metadata for loading the other 3 parts.

So when you load the first file, llama.cpp finds the other 3 parts in the same folder and loads those as well.
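
If you want to check that all parts are actually present before launching, here is a minimal Python sketch. It assumes the shards follow the `-00001-of-00004.gguf` naming seen in the command above. It is not llama.cpp's own code: it reads the part count from the filename rather than from the GGUF metadata, and the helper names `sibling_shards` and `missing_shards` are made up for illustration.

```python
import re
from pathlib import Path

# Assumed naming pattern for split GGUF files: <prefix>-00001-of-00004.gguf
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<count>\d{5})\.gguf$")


def sibling_shards(first_shard: str) -> list[Path]:
    """Return the paths of all parts expected next to the given shard."""
    path = Path(first_shard)
    match = SHARD_RE.match(path.name)
    if match is None:
        # Not a split file name: treat it as a single-file model.
        return [path]
    prefix = match["prefix"]
    count = int(match["count"])
    # Every part lives in the same folder and differs only in its index.
    return [
        path.with_name(f"{prefix}-{i:05d}-of-{count:05d}.gguf")
        for i in range(1, count + 1)
    ]


def missing_shards(first_shard: str) -> list[Path]:
    """Return the expected parts that do not exist on disk."""
    return [p for p in sibling_shards(first_shard) if not p.is_file()]
```

Calling `missing_shards(...)` with the path to the `-00001-of-00004.gguf` file returns an empty list when all four parts sit in the same folder.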

thanks!

hermitg changed discussion status to closed
