What version of llama.cpp do you use?

by Hamora - opened 15 days ago

Discussion

Hamora

15 days ago

nicoboss

15 days ago

•

edited 15 days ago

We use our own llama.cpp fork, which you can find at https://github.com/nicoboss/llama.cpp.
Now that https://github.com/ggml-org/llama.cpp/pull/9400 is merged, it is almost identical to upstream and only adds some quality-of-life features like DRYRUN and some debug output. We last merged upstream and updated our workers on 22 July 2025 at 20:19 UTC+2.

nicoboss

15 days ago

I see that this one was quantized before 22 July 2025 at 20:19 UTC+2, so it must have used the version from 18 July at 11:00 GMT+2. Back then, our fork also still contained the 3D tensor imatrix MLA fix and the fix for storing imatrix data for tensors with partially uncovered experts, on top of whatever was on mainline at that time.

nicoboss

15 days ago

If there was any llama.cpp change in the meantime that would affect these quants, please let us know and we can requant this model, but keep in mind that in that case this conversation would be deleted as well.

Hamora

15 days ago

Aight, solved! Thanks :) !
