No longer compatible with the newest llama.cpp

#22
by bero1985 - opened

Sadly, this seems to no longer be compatible with the newest llama.cpp. I get:

llama_model_load: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'
llama_load_model_from_file: failed to load model

Maybe update/recreate the GGUFs in the newer GGUF file-format version?
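
As far as I can tell, newer llama.cpp builds expect Mixtral's MoE expert weights merged into single tensors (e.g. blk.0.ffn_down_exps.weight) instead of the old per-expert blk.0.ffn_down.N.weight split, which would explain the missing-tensor error on older GGUFs. Here is a minimal sketch to check which layout a file uses, assuming the `gguf` package from llama.cpp's gguf-py (the reader API may vary by version):

```python
# Checks whether a Mixtral GGUF uses the merged MoE tensor layout
# (blk.0.ffn_down_exps.weight) that newer llama.cpp expects, or the
# old per-expert split (blk.0.ffn_down.0.weight, ...).
import sys
from gguf import GGUFReader  # pip install gguf

reader = GGUFReader(sys.argv[1])          # path to the .gguf file
names = {t.name for t in reader.tensors}

if "blk.0.ffn_down_exps.weight" in names:
    print("merged MoE layout: should load in current llama.cpp")
elif "blk.0.ffn_down.0.weight" in names:
    print("old per-expert layout: needs a re-converted GGUF")
else:
    print("no block-0 MoE FFN tensors found")
```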


Hi, did you manage to solve the tensor issue? I'm going through the same thing right now.


I had the same issue, and I just redownloaded a new quant of the model. It works fine now.


Thanks for replying. Geez, I just can't seem to make it work with either llama.cpp or the llama-mixtral repo. (RTX 3060 12 GB, Ryzen 5 7600X, 32 GB RAM)


Update to the latest version of llama.cpp. I use LM Studio.
Download the model from this repo:
https://huggingface.co/mradermacher/Mixtral-v0.1-8x7B-Instruct-i1-GGUF
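
If you'd rather script the download, here is a minimal sketch assuming the `huggingface_hub` package; the quant filename below is a placeholder, so list the repo's files first and pick a real one:

```python
# Lists the quants in the repo above, then downloads one.
from huggingface_hub import hf_hub_download, list_repo_files

repo_id = "mradermacher/Mixtral-v0.1-8x7B-Instruct-i1-GGUF"

# Print the available files to find the exact quant name.
for name in list_repo_files(repo_id):
    print(name)

# Hypothetical filename -- replace it with one printed above.
path = hf_hub_download(repo_id, "Mixtral-v0.1-8x7B-Instruct.i1-Q4_K_M.gguf")
print("saved to", path)
```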

I highly suggest using a smaller model like Mistral-Nemo or Mistral-Small-24B; the output is similar to Mixtral's.
I run the Q5_K_L version of the model on 24 GB of VRAM with many layers offloaded, and the speed is between 4-6 t/s.
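
For anyone doing the offloading from code instead of LM Studio, here is a minimal sketch assuming the llama-cpp-python bindings; the model path and layer count are illustrative and need tuning to your VRAM:

```python
# Loads a local GGUF and offloads a fixed number of layers to the GPU.
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="Mistral-Small-24B.Q5_K_L.gguf",  # hypothetical local file
    n_gpu_layers=40,  # lower this if the model overflows VRAM
    n_ctx=4096,
)

out = llm("Explain mixture-of-experts in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```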
