What version of llama.cpp do you use?

#1
by Hamora - opened

We use our own llama.cpp fork which you can find under https://github.com/nicoboss/llama.cpp.
Now that https://github.com/ggml-org/llama.cpp/pull/9400 is merged, it is almost identical to upstream and only adds some quality-of-life features such as DRYRUN and some debug output. We last merged upstream and updated our workers on 22 July 2025 at 20:19 UTC+2.

I see that this one was quantized before 22 July 2025 at 20:19 UTC+2, so it must have used the version from 18 July at 11:00 GMT+2. Back then our fork also still contained the 3D tensor imatrix MLA fix and the fix for storing imatrix data for tensors with partially uncovered experts, on top of whatever was on mainline at that time.
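
For anyone who wants to repeat this check for another quant, here is a minimal Python sketch of the same reasoning: take the list of worker update times and pick the latest one at or before the quantization time. The two update times are the ones mentioned in this thread (the year of the 18 July update is assumed to be 2025); the function name `fork_update_for` and the example quantization time are hypothetical and only there for illustration.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

UTC_PLUS_2 = timezone(timedelta(hours=2))

# Worker update times mentioned in this thread, sorted ascending.
# Assumption: the 18 July update was also in 2025.
WORKER_UPDATES = [
    datetime(2025, 7, 18, 11, 0, tzinfo=UTC_PLUS_2),
    datetime(2025, 7, 22, 20, 19, tzinfo=UTC_PLUS_2),
]


def fork_update_for(quantized_at):
    """Return the latest worker update at or before quantized_at, or None.

    quantized_at must be a timezone-aware datetime.
    """
    idx = bisect_right(WORKER_UPDATES, quantized_at)
    return WORKER_UPDATES[idx - 1] if idx else None


# Hypothetical quantization time, for illustration only.
example = datetime(2025, 7, 20, 9, 0, tzinfo=UTC_PLUS_2)
print(fork_update_for(example))  # 2025-07-18 11:00:00+02:00
```

A quant made before the earliest listed update returns `None`, which just means the list would need older entries to answer the question.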

If any llama.cpp change in the meantime would affect these quants, please let us know and we can requant this model, but keep in mind that in this case this conversation would get deleted as well.

Aight, solved! Thanks :) !
