No longer compatible with the newest llama.cpp

#22
by bero1985 - opened

Sadly, this seems to no longer be compatible with the newest llama.cpp. I get:

llama_model_load: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'
llama_load_model_from_file: failed to load model

Maybe update/recreate the GGUFs in the newer GGUF file-format version?
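
As far as I can tell, newer llama.cpp builds expect Mixtral's MoE expert weights merged into single tensors (e.g. blk.0.ffn_down_exps.weight) instead of the old per-expert blk.0.ffn_down.N.weight split, which would explain the missing-tensor error on older GGUFs. Here is a minimal sketch to check which layout a file uses, assuming the `gguf` package from llama.cpp's gguf-py (the reader API may vary by version):

```python
# Checks whether a Mixtral GGUF uses the merged MoE tensor layout
# (blk.0.ffn_down_exps.weight) that newer llama.cpp expects, or the
# old per-expert split (blk.0.ffn_down.0.weight, ...).
import sys
from gguf import GGUFReader  # pip install gguf

reader = GGUFReader(sys.argv[1])          # path to the .gguf file
names = {t.name for t in reader.tensors}

if "blk.0.ffn_down_exps.weight" in names:
    print("merged MoE layout: should load in current llama.cpp")
elif "blk.0.ffn_down.0.weight" in names:
    print("old per-expert layout: needs a re-converted GGUF")
else:
    print("no block-0 MoE FFN tensors found")
```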


Hi, did you manage to solve the tensor issue? I'm going through the same thing right now.


I had the same issue, and I just redownloaded a new quant of the model. It works fine now.


Thanks for replying. Geez, I just can't seem to make it work with either llama.cpp or the llama-mixtral repo. (RTX 3060 12 GB, Ryzen 5 7600X, 32 GB RAM)


Update to the latest version of llama.cpp. I use LM Studio.
Download the model from this repo:
https://huggingface.co/mradermacher/Mixtral-v0.1-8x7B-Instruct-i1-GGUF
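
If you'd rather script the download, here is a minimal sketch assuming the `huggingface_hub` package; the quant filename below is a placeholder, so list the repo's files first and pick a real one:

```python
# Lists the quants in the repo above, then downloads one.
from huggingface_hub import hf_hub_download, list_repo_files

repo_id = "mradermacher/Mixtral-v0.1-8x7B-Instruct-i1-GGUF"

# Print the available files to find the exact quant name.
for name in list_repo_files(repo_id):
    print(name)

# Hypothetical filename -- replace it with one printed above.
path = hf_hub_download(repo_id, "Mixtral-v0.1-8x7B-Instruct.i1-Q4_K_M.gguf")
print("saved to", path)
```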

I highly suggest using a smaller model like Mistral-Nemo or Mistral-Small-24B; the output is similar to Mixtral's.
I run the Q5_K_L version of the model on 24 GB of VRAM with many layers offloaded, and the speed is between 4-6 t/s.
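
For anyone doing the offloading from code instead of LM Studio, here is a minimal sketch assuming the llama-cpp-python bindings; the model path and layer count are illustrative and need tuning to your VRAM:

```python
# Loads a local GGUF and offloads a fixed number of layers to the GPU.
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="Mistral-Small-24B.Q5_K_L.gguf",  # hypothetical local file
    n_gpu_layers=40,  # lower this if the model overflows VRAM
    n_ctx=4096,
)

out = llm("Explain mixture-of-experts in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```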
