Does llama.cpp load only 1/4 of the parameters, or is this just syntactic sugar?

by hermitg - opened Jul 23

Jul 23

The official documentation at https://docs.unsloth.ai/basics/qwen3-coder#llama.cpp-run-qwen3-tutorial shows this command:

./llama.cpp/llama-cli \
    --model unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF/UD-Q2_K_XL/Qwen3-Coder-480B-A35B-Instruct-UD-Q2_K_XL-00001-of-00004.gguf \
...

It looks like this loads only 1/4 of the model's parameters. Is that right, or is it just syntactic sugar for loading the whole model?

danielhanchen

Unsloth AI org Jul 23

Oh the first file already contains the metadata for loading the other 3 parts.

So when you load the first, it'll find the other 3 parts in the same folder, and load those as well
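
Because all parts have to sit in the same folder, it can help to check that none are missing before starting a long load. The following is a minimal Python sketch of such a pre-flight check, not llama.cpp's own logic. It assumes the `<prefix>-<index>-of-<total>.gguf` naming seen in the command above (five-digit, 1-based numbers), and the helper names `expected_shards` and `missing_shards` are hypothetical. llama.cpp itself relies on the metadata in the first file rather than on this kind of name parsing.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the file name in the question:
# <prefix>-<index>-of-<total>.gguf with five-digit, 1-based numbers.
SHARD_RE = re.compile(
    r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$"
)


def expected_shards(first_shard: str) -> list[Path]:
    """Return the paths of all shards implied by the first shard's file name."""
    path = Path(first_shard)
    match = SHARD_RE.match(path.name)
    if match is None:
        # The name does not look like a split file: treat it as a single file.
        return [path]
    prefix = match.group("prefix")
    total = int(match.group("total"))
    # Sibling shards are expected in the same folder as the first one.
    return [
        path.with_name(f"{prefix}-{i:05d}-of-{total:05d}.gguf")
        for i in range(1, total + 1)
    ]


def missing_shards(first_shard: str) -> list[Path]:
    """Return the expected shards that are not present on disk."""
    return [p for p in expected_shards(first_shard) if not p.is_file()]


if __name__ == "__main__":
    first = (
        "unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF/UD-Q2_K_XL/"
        "Qwen3-Coder-480B-A35B-Instruct-UD-Q2_K_XL-00001-of-00004.gguf"
    )
    for shard in missing_shards(first):
        print(f"missing: {shard}")
```

If the script prints nothing, all four parts are in place and passing only the first file to `--model` should be enough.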

hermitg

Jul 24

thanks!

hermitg changed discussion status to closed Jul 24
