No longer compatible with the newest llama.cpp
Sadly, it seems to no longer be compatible with the newest llama.cpp. I get:
llama_model_load: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'
llama_load_model_from_file: failed to load model
Maybe update/recreate the GGUFs to the newer GGUF file format version?
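If you want to check whether a given GGUF still uses the old per-expert MoE tensor layout, here is a minimal sketch using the gguf Python package that ships with llama.cpp (gguf-py); the model path is a placeholder:

```python
# pip install gguf
from gguf import GGUFReader

# Placeholder path: point this at your local quant.
reader = GGUFReader("mixtral-8x7b-instruct.Q4_K_M.gguf")
names = [t.name for t in reader.tensors]

# Newer llama.cpp expects fused expert tensors such as
# 'blk.0.ffn_down_exps.weight'; older Mixtral quants store one
# tensor per expert, e.g. 'blk.0.ffn_down.0.weight'.
print("fused expert tensors:", any("ffn_down_exps" in n for n in names))
print("per-expert tensors:", any(".ffn_down.0." in n for n in names))
```

If only the per-expert names show up, the file predates the format change and needs to be re-converted or re-downloaded.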
Hi, did you manage to solve the tensor issue? I'm going through the same thing right now.
I had the same issue, and I just redownloaded a new quant of the model. It works fine now.
Thanks for replying. Geez, I just can't seem to make it work with either llama.cpp or the llama-mixtral repo. RTX 3060 12GB, Ryzen 5 7600X, 32GB RAM.
Update to the latest version of llama.cpp. I use LM Studio.
Download the model from this repo:
https://huggingface.co/mradermacher/Mixtral-v0.1-8x7B-Instruct-i1-GGUF
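If you prefer scripting the download, a hedged sketch with huggingface_hub (the exact quant filename below is an assumption; pick a real one from the repo's file list):

```python
# pip install huggingface_hub
from huggingface_hub import hf_hub_download

# repo_id is from the link above; the filename is a guess,
# so check the repo's "Files and versions" tab for actual quants.
path = hf_hub_download(
    repo_id="mradermacher/Mixtral-v0.1-8x7B-Instruct-i1-GGUF",
    filename="Mixtral-v0.1-8x7B-Instruct.i1-Q4_K_M.gguf",
)
print("downloaded to:", path)
```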
I highly suggest using a smaller model like Mistral-Nemo or Mistral-Small-24B; the output is similar to Mixtral's.
I run the Q5_K_L version of the model on 24GB of VRAM, and I offload many layers. The speed is between 4-6 t/s.
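For anyone not using LM Studio, a minimal llama-cpp-python sketch of the same partial-offload setup (the model path and layer count are assumptions; tune n_gpu_layers to your VRAM):

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Assumed values: raise n_gpu_layers until you run out of VRAM;
# -1 would offload every layer to the GPU.
llm = Llama(
    model_path="Mixtral-v0.1-8x7B-Instruct.i1-Q5_K_M.gguf",
    n_gpu_layers=12,
    n_ctx=4096,
)
out = llm("Q: What does MoE stand for? A:", max_tokens=64)
print(out["choices"][0]["text"])
```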