model loading error
clip_init: failed to load model '/root/.cache/llama.cpp/unsloth_cogito-v2-preview-llama-109B-MoE-GGUF_mmproj-F16.gguf': operator(): unable to find tensor v.blk.0.attn_k.weight
mtmd_init_from_file: error: Failed to load CLIP model from /root/.cache/llama.cpp/unsloth_cogito-v2-preview-llama-109B-MoE-GGUF_mmproj-F16.gguf
srv load_model: failed to load multimodal model, '/root/.cache/llama.cpp/unsloth_cogito-v2-preview-llama-109B-MoE-GGUF_mmproj-F16.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
./llama-server \
  -hf unsloth/cogito-v2-preview-llama-109B-MoE-GGUF:Q6_K_XL \
  --n-gpu-layers 99 \
  --jinja \
  --threads 36 \
  --threads-batch 24 \
  -sm row \
  --temp 0.6 \
  --min-p 0.01 \
  --top-p 0.9 \
  --ctx-size 16384 \
  --no-context-shift \
  --port 8080 \
  --host 0.0.0.0 \
  --metrics
I also got the same error.
If you're like me and don't care about multimodal inputs, add --no-mmproj to the args so the mmproj file is skipped. Otherwise, download or copy the original files from other GGUF repos where they are present.
Nonetheless, the model seems to output only ":" when called with the args below:
-hf unsloth/cogito-v2-preview-llama-109B-MoE-GGUF:Q3_K_XL --cache-type-k q4_0 --n-gpu-layers 99 --ctx-size 8192 -ot \".ffn_.*_exps.=CPU\" --no-mmproj -a Cogito2-Scout
Edit: sample HTTP log https://pastebin.com/Xmfeyb27
Thank you. Yes, I don't need images as of now.
I tried removing KV cache quantisation, adjusting the context size, and adding --jinja, but the model still only outputs "::::::". I also tried comparing the GGUFs with Unsloth's Llama 4 Scout, but couldn't spot any obvious difference that would lead to this behavior. Args for that run:
--jinja --n-gpu-layers 99 --ctx-size 16384 -ot \".ffn_.*_exps.=CPU\" --no-mmproj -a Cogito2-Scout
Same output for me too.
Update: I re-downloaded the model and it's now working.
OK, it was most likely an issue with the Q3_K_XL. I just re-downloaded the Q4_K_S, and the following args produce output as expected:
-hf unsloth/cogito-v2-preview-llama-109B-MoE-GGUF:Q4_K_S
--parallel 1 -ngl 12 --ctx-size 4096 --no-mmproj -a Cogito2-Scout
My output after adding --no-mmproj is "GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG". I will try to re-download the model.
Edit: Redownloading fixed the repetition error.
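Since re-downloading fixed the garbage output for several people here, a damaged or incomplete file in the cache looks like a plausible cause, though nobody in the thread confirmed it. Below is a rough sketch for checking a cached GGUF before pulling it again. It assumes you can get a reference SHA-256 for the file from the repo's file page. The function names, the example path, and the hash placeholder are all made up for illustration and are not part of llama.cpp. The magic-byte check only catches grossly wrong files; a truncated or partly corrupted file would still pass it, which is why the hash comparison is the part that matters.

```python
import hashlib

GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four ASCII bytes


def has_gguf_magic(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so a large model is never fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_download(path, expected_sha256):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if not has_gguf_magic(path):
        problems.append("file does not start with the GGUF magic bytes")
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        problems.append(f"SHA-256 mismatch: got {actual}")
    return problems


# Example (hypothetical path and hash, substitute your own):
# print(check_download("/path/to/model.gguf", "<sha256 from the repo page>"))
```

If the list comes back non-empty, deleting the cached file and letting -hf fetch it again is probably the quickest thing to try. Note that large quants may be split into several parts, in which case each part would need its own check.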