Unknown Architecture 'hunyuan-moe' (using --jinja with llama-server)

#4
by x-polyglot-x - opened

Hi unsloth!

Thanks for uploading this model. I can't wait to try it out!

I'm running into the unknown architecture error above with this command:

  • llama-server -m Documents/llm/hunyuan_a13b/Hunyuan-A13B-Instruct-Q8_0-00001-of-00002.gguf -ngl 99 --jinja --temp 0.7 --top-k 20 --top-p 0.8 --repeat-penalty 1.05

I also tried again using the direct line of code you mentioned at the top of the model card page:

  • llama-cli -hf unsloth/Hunyuan-A13B-Instruct-GGUF:Q4_K_XL -ngl 99 --jinja --temp 0.7 --top-k 20 --top-p 0.8 --repeat-penalty 1.05

Any advice on what to try? I am on an M4 Max if that makes any difference.

Thanks again!

What build of llama.cpp are you using? Support for the model was added in b5843.
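If it helps, you can read the build number straight from the version banner. This is a sketch assuming llama-server is on your PATH and that it prints a line of the form `version: 5840 (abc1234)` when given `--version` (current llama.cpp binaries do, but the exact format may vary by build):

```shell
# Grab the "version: NNNN (hash)" line; some builds print it to stderr,
# so merge stderr into stdout before filtering.
ver_line=$(llama-server --version 2>&1 | grep -m1 '^version:')

build=${ver_line#version: }   # strip the "version: " prefix -> "5840 (abc1234)"
build=${build%% *}            # strip the trailing "(hash)"  -> "5840"

if [ "$build" -ge 5843 ]; then
  echo "build $build: hunyuan-moe should be supported"
else
  echo "build $build: too old, need b5843 or newer"
fi
```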

I was on an older version. I ran brew upgrade llama.cpp, which got me to build 5840 (the latest stable Homebrew release). Sadly, that still isn't b5843. I downloaded the zip file for Mac ARM and am now trying to figure out how to install the newest version.

Thanks for this tip!!

I was able to successfully run the model after using cmake to build the latest llama.cpp.

The following command now works: llama.cpp/build/bin/llama-server -m "Documents/llm/hunyuan_a13b/Hunyuan-A13B-Instruct-Q8_0-00001-of-00002.gguf" -ngl 99 --jinja --temp 0.7 --top-k 20 --top-p 0.8 --repeat-penalty 1.05

Thank you very much for your assistance!!

x-polyglot-x changed discussion status to closed
x-polyglot-x changed discussion status to open

I am facing the same problem. I git-cloned the llama.cpp repo, but after building with cmake, llama-server reports version 5830. git branch says I am on master.
The releases are already at b5854: https://github.com/ggml-org/llama.cpp/releases
How can I update my checkout to that? Do I have to git checkout a different branch? And if so, which one?
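For what it's worth, the bNNNN release numbers are tags on master, not separate branches, so you shouldn't need to switch branches. A sketch of updating an existing clone (b5854 is the release tag mentioned above):

```shell
cd llama.cpp
git checkout master        # stay on master; releases are tags, not branches
git pull origin master     # fast-forward to the latest commits
git fetch --tags origin    # make sure release tags like b5854 are present
git checkout b5854         # optional: build exactly that tagged release
```

After updating, rebuild with cmake/make as before so the binaries pick up the new code.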

I followed a guide from a Medium article (link here: https://medium.com/@jackcheang5/running-llama-cpp-in-mac-22e71123b811)

NOTE: This is for Apple silicon devices!!

First, clone the GitHub repository:

git clone https://github.com/ggerganov/llama.cpp

then run these steps:

cd llama.cpp
mkdir build
cd build
cmake .. -DCMAKE_APPLE_SILICON_PROCESSOR=arm64
make -j

After that, launch llama-server from the build directory (in other words, point to the binary directly):

llama.cpp/build/bin/llama-server -m "Documents/llm/hunyuan_a13b/Hunyuan-A13B-Instruct-Q8_0-00001-of-00002.gguf" -ngl 99 --jinja --temp 0.7 --top-k 20 --top-p 0.8 --repeat-penalty 1.05

The code above assumes you are launching Terminal with the 'llama.cpp' folder in your user directory (the default) and that your model is stored in your Documents folder. You will likely need to change both of these paths.
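Once the server is up, you can sanity-check it from a second terminal. This assumes the default host and port (127.0.0.1:8080); llama-server exposes a /health endpoint and an OpenAI-compatible chat endpoint, though your port may differ if you passed --port:

```shell
# Should return a JSON status once the model has finished loading.
curl -s http://127.0.0.1:8080/health

# Minimal smoke-test request against the chat completions endpoint.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}],"temperature":0.7}'
```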
