fuzzy-mittenz committed
Commit 4aeca78 · verified · 1 Parent(s): 6d09165

Update README.md


![thoth2.png](https://cdn-uploads.huggingface.co/production/uploads/6593502ca2607099284523db/5hpy4IHflPFigikhFPAKj.png)

Files changed (1)
  1. README.md +4 -30
README.md CHANGED
@@ -34,6 +34,10 @@ model-index:
 ---
 
 # THOTH Experiment
+
+![thoth2.png](https://cdn-uploads.huggingface.co/production/uploads/6593502ca2607099284523db/5hpy4IHflPFigikhFPAKj.png)
+
+# Model is Experimental Imatrix Quant using "THE_KEY" Dataset in QAT
 This model was converted to GGUF format from [`NousResearch/Hermes-3-Llama-3.2-3B`](https://huggingface.co/NousResearch/Hermes-3-Llama-3.2-3B) using llama.cpp.
 Refer to the [original model card](https://huggingface.co/NousResearch/Hermes-3-Llama-3.2-3B) for more details on the model.
 
@@ -46,33 +50,3 @@ brew install llama.cpp
 ```
 Invoke the llama.cpp server or the CLI.
 
-### CLI:
-```bash
-llama-cli --hf-repo fuzzy-mittenz/Hermes-3-Llama-3.2-3B-IQ4_NL-GGUF --hf-file hermes-3-llama-3.2-3b-iq4_nl-imat.gguf -p "The meaning to life and the universe is"
-```
-
-### Server:
-```bash
-llama-server --hf-repo fuzzy-mittenz/Hermes-3-Llama-3.2-3B-IQ4_NL-GGUF --hf-file hermes-3-llama-3.2-3b-iq4_nl-imat.gguf -c 2048
-```
-
-Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
-
-Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
-```
-cd llama.cpp && LLAMA_CURL=1 make
-```
-
-Step 3: Run inference through the main binary.
-```
-./llama-cli --hf-repo fuzzy-mittenz/Hermes-3-Llama-3.2-3B-IQ4_NL-GGUF --hf-file hermes-3-llama-3.2-3b-iq4_nl-imat.gguf -p "The meaning to life and the universe is"
-```
-or
-```
-./llama-server --hf-repo fuzzy-mittenz/Hermes-3-Llama-3.2-3B-IQ4_NL-GGUF --hf-file hermes-3-llama-3.2-3b-iq4_nl-imat.gguf -c 2048
-```
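The new heading describes this checkpoint as an experimental imatrix quant built with the "THE_KEY" dataset. As a rough sketch of how an IQ4_NL imatrix quant like this is typically produced with llama.cpp's own tooling; the calibration file `the_key.txt` and the f16 source GGUF name are assumptions, not files from this repo:

```bash
# Build an importance matrix from a calibration corpus.
# the_key.txt is a hypothetical stand-in for the "THE_KEY" dataset;
# the f16 input GGUF name is likewise assumed.
./llama-imatrix -m Hermes-3-Llama-3.2-3B-f16.gguf -f the_key.txt -o imatrix.dat

# Quantize to IQ4_NL, weighting tensors by the importance matrix.
./llama-quantize --imatrix imatrix.dat \
  Hermes-3-Llama-3.2-3B-f16.gguf \
  hermes-3-llama-3.2-3b-iq4_nl-imat.gguf IQ4_NL
```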
 
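The quick-start commands removed above rely on llama.cpp's built-in `--hf-repo`/`--hf-file` download support. A minimal alternative sketch that fetches the file explicitly first, assuming `huggingface-cli` (from `huggingface_hub`) and a local llama.cpp build are installed:

```bash
# Download the quantized checkpoint from this repo.
huggingface-cli download fuzzy-mittenz/Hermes-3-Llama-3.2-3B-IQ4_NL-GGUF \
  hermes-3-llama-3.2-3b-iq4_nl-imat.gguf --local-dir .

# Run it with the CLI, or serve it over HTTP with a 2048-token context.
./llama-cli -m hermes-3-llama-3.2-3b-iq4_nl-imat.gguf \
  -p "The meaning to life and the universe is"
./llama-server -m hermes-3-llama-3.2-3b-iq4_nl-imat.gguf -c 2048
```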