https://huggingface.co/aquiffoo/aquif-3-moe-17b

#1236
by FlameF0X - opened

AquifMoeForCausalLM is unfortunately not currently supported by llama.cpp. By the way, in case you wonder whether a model is supported, you can always Ctrl+F for the architecture in https://raw.githubusercontent.com/ggml-org/llama.cpp/refs/heads/master/convert_hf_to_gguf.py. If it is not listed there, it is not currently supported by llama.cpp and so no GGUF quants can be made.
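The Ctrl+F check above can be scripted. A minimal sketch, assuming the architecture names appear in convert_hf_to_gguf.py as quoted string literals (the script registers them that way), so a plain substring search suffices; `is_supported` and `fetch_script` are illustrative helper names, not part of llama.cpp:

```python
# Sketch: check whether an architecture name is listed in llama.cpp's
# convert_hf_to_gguf.py. A missing name means no GGUF conversion support.
import urllib.request

CONVERT_URL = ("https://raw.githubusercontent.com/ggml-org/llama.cpp/"
               "refs/heads/master/convert_hf_to_gguf.py")

def fetch_script() -> str:
    # Download the current conversion script from the master branch.
    with urllib.request.urlopen(CONVERT_URL) as resp:
        return resp.read().decode("utf-8")

def is_supported(arch: str, script_text: str) -> bool:
    # Supported architectures appear as quoted string literals in the
    # script, so a simple substring search is enough for a quick check.
    return f'"{arch}"' in script_text
```

Usage would be something like `is_supported("AquifMoeForCausalLM", fetch_script())`, which returns False as long as the architecture is unsupported.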

Actually, I think the model author might be "cheating": he released https://huggingface.co/aquiffoo/aquif-3-moe-17b-GGUF/tree/main, for which he just replaced the architecture and pre-tokenizer with bailingmoe. So it seems to me that the AquifMoeForCausalLM architecture is "fake" and in reality it is just BailingMoeForCausalLM. Maybe I should give it a try anyway.

bruh, okay. Thanks for informing me. Have a nice day.

Converting to the source GGUF started successfully. Let's hope the result is actually a GGUF that will run using mainline llama.cpp, but I see no reason why it shouldn't, as it really is just BailingMoeForCausalLM. I didn't even have to modify llama.cpp in any way. All I did was edit the model config and replace AquifMoeForCausalLM with BailingMoeForCausalLM.

I'm also testing the newly implemented --outtype=source option:
venv/bin/python convert_hf_to_gguf.py /apool/aquif-3-moe-17b --outtype=source --outfile=/mradermacher/tmp/quant/aquif-3-moe-17b.gguf

Okay! Thank you for your time!

It's queued! :D
Sorry for the delay. I had to redo the source GGUF, as in the meantime the model got renamed to aquif-3-moe-17b-a2.8b.

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/aquif-3-moe-17b-a2.8b-GGUF for quants to appear.

I checked your HuggingFace profile and have to say I'm really impressed with you training an entire model from scratch. How many GPU hours did it take you to train SnowflakeCore G1? I never tried training one from scratch as I always thought that there is absolutely no way I could ever afford the resources required to do so.