Add llama.cpp to the examples
Let's add llama.cpp to the examples, since two of the other applications mentioned are powered by the llama.cpp library (ggml):
https://github.com/ggml-org/ggml
https://github.com/ggml-org/llama.cpp
README.md
CHANGED
@@ -394,6 +394,46 @@ docker run -it --rm --pull=always \
 Click “see advanced setting” on the second line.
 In the new tab, toggle advanced to on. Set the custom model to mistral/devstralq4_k_m and the Base URL to the API address we got from the last step in LM Studio. Set the API Key to dummy. Click save changes.
 
+### llama.cpp
+
+Download the weights from Hugging Face:
+
+```
+pip install -U "huggingface_hub[cli]"
+huggingface-cli download \
+"mistralai/Devstral-Small-2505_gguf" \
+--include "devstralQ4_K_M.gguf" \
+--local-dir "mistralai/Devstral-Small-2505_gguf/"
+```
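+
+Alternatively, if your llama.cpp build has download support (libcurl), `llama-cli` can fetch the GGUF straight from Hugging Face itself. This is a sketch; check `--help` on your build for the `--hf-repo`/`--hf-file` flags:
+
+```bash
+# Fetch the quantized weights from Hugging Face and start an interactive chat in one step
+./llama-cli --hf-repo mistralai/Devstral-Small-2505_gguf \
+            --hf-file devstralQ4_K_M.gguf \
+            -cnv
+```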
+
+Then run Devstral using the llama.cpp CLI:
+
+```bash
+./llama-cli -m mistralai/Devstral-Small-2505_gguf/devstralQ4_K_M.gguf -cnv
+```
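+
+To reuse the advanced-settings flow above instead of the CLI, `llama-server` exposes an OpenAI-compatible API. A sketch; the port is arbitrary:
+
+```bash
+# Serve the model over an OpenAI-compatible HTTP API
+./llama-server -m mistralai/Devstral-Small-2505_gguf/devstralQ4_K_M.gguf --port 8080
+
+# In another terminal: quick smoke test against the chat completions endpoint
+curl http://localhost:8080/v1/chat/completions \
+    -H "Content-Type: application/json" \
+    -d '{"messages": [{"role": "user", "content": "Hello"}]}'
+```
+
+Point the Base URL at http://localhost:8080/v1 and keep the dummy API key, as in the LM Studio setup above.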
 
 ### Ollama
 