Does llama.cpp load only 1/4 of the parameters, or is this syntactic sugar?
#5
by
hermitg
- opened
The official documentation at https://docs.unsloth.ai/basics/qwen3-coder#llama.cpp-run-qwen3-tutorial shows this command:
./llama.cpp/llama-cli \
--model unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF/UD-Q2_K_XL/Qwen3-Coder-480B-A35B-Instruct-UD-Q2_K_XL-00001-of-00004.gguf \
...
It seems to load only 1/4 of the model parameters. Is this just syntactic sugar?
The first file already contains the metadata needed to load the other 3 parts.
So when you load the first file, llama.cpp finds the other 3 parts in the same folder and loads them as well.
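If you want to confirm that all parts are in place before launching, here is a minimal Python sketch. It assumes the shards follow the `-NNNNN-of-NNNNN.gguf` naming pattern seen in the command above. The helper names `expected_shards` and `missing_shards` are hypothetical. The sketch only looks at file names; it does not read the GGUF metadata and is not llama.cpp's own loading code.

```python
import re
from pathlib import Path

# Assumed shard naming pattern: <prefix>-00001-of-00004.gguf
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def expected_shards(first_shard: str) -> list[Path]:
    """Return the paths of all shards implied by one shard's file name."""
    path = Path(first_shard)
    match = SHARD_RE.match(path.name)
    if match is None:
        raise ValueError(f"not a split GGUF file name: {path.name}")
    total = int(match["total"])
    # Sibling shards are expected in the same folder as the given file.
    return [
        path.with_name(f"{match['prefix']}-{i:05d}-of-{total:05d}.gguf")
        for i in range(1, total + 1)
    ]


def missing_shards(first_shard: str) -> list[Path]:
    """Return the expected shard paths that do not exist on disk."""
    return [p for p in expected_shards(first_shard) if not p.exists()]
```

Calling `missing_shards(...)` with the path you pass to `--model` should return an empty list when all 4 files sit in the same folder.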
thanks!
hermitg
changed discussion status to
closed