Commit 193a9be by Thireus · 1 parent: 79266e1
Update README.md
README.md CHANGED
@@ -38,10 +38,10 @@ cd GGUF-Tool-Suite
 rm -f download.conf # Make sure to copy the relevant download.conf for the model before running quant_assign.py
 cp -f models/DeepSeek-TNG-R1T2-Chimera/download.conf . # Use the download.conf of the chosen model
 mkdir -p kitchen && cd kitchen
-../quant_downloader.sh ../recipe_examples/DeepSeek-TNG-R1T2-Chimera.ROOT-3.0624bpw-3.3657ppl.238GB-GGUF_11GB-GPU_227GB-CPU.13549e6_1ac857a.recipe
+../quant_downloader.sh ../recipe_examples/ik_llama.cpp_recipes/DeepSeek-TNG-R1T2-Chimera.ROOT-3.0624bpw-3.3657ppl.238GB-GGUF_11GB-GPU_227GB-CPU.13549e6_1ac857a.recipe
 
 # Launch ik_llama's llama-cli:
-ulimit -n
+ulimit -n 9999 # Lifts "too many open files" limitation on Linux
 ~/ik_llama.cpp/build/bin/llama-cli \
 -m DeepSeek-TNG-R1T2-Chimera-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01148.gguf \
 -mla 3 -fa -amb 512 -fmoe -ctk f16 -c 4096 -ngl 99 \
@@ -74,6 +74,8 @@ Here’s how DeepSeek-R1-0528 quantized with **Thireus’ GGUF Tool Suite** stac
 
 More perplexity/bpw graphs for other supported models: https://github.com/Thireus/GGUF-Tool-Suite/tree/main/ppl_graphs
 
+*All PPL values are computed with the parameters `-ctk f16 -c 512 -b 4096 -ub 4096`. Changing any of these parameters will alter the PPL. In particular, reducing `-b 4096 -ub 4096` increases the PPL, while increasing them decreases the PPL.*
+
 ---
 
 ## 🚀 How do I get started?
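
The `ulimit -n 9999` line in this change raises the per-process open-file limit before launching `llama-cli`. The model filename ends in `-00001-of-01148.gguf`, which suggests the model is split into 1148 shard files, more than the default soft limit on many Linux systems (often 1024). The following is a minimal sketch of a more defensive variant, not part of GGUF-Tool-Suite: the helper name `needs_raise` and the variable `required` are hypothetical, and the value 9999 is taken from the README line above. It raises the soft limit only when it is too low and the hard limit allows it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GGUF-Tool-Suite).
# needs_raise CURRENT REQUIRED: succeeds when CURRENT is a number below REQUIRED.
needs_raise() {
  [ "$1" != "unlimited" ] && [ "$1" -lt "$2" ]
}

required=9999            # value used in the README change
soft=$(ulimit -Sn)       # current soft limit on open files
hard=$(ulimit -Hn)       # hard ceiling the soft limit cannot exceed

if needs_raise "$soft" "$required"; then
  if needs_raise "$hard" "$required"; then
    # The hard limit is too low; an unprivileged shell cannot go past it.
    echo "hard limit $hard is below $required; raise the hard limit first" >&2
  else
    ulimit -Sn "$required"   # affects this shell and the processes it starts
  fi
fi

echo "open-file soft limit: $(ulimit -Sn)"
```

Because `ulimit` only affects the current shell and its children, a sketch like this would need to run in the same shell session that later starts `llama-cli`.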