Thireus committed · Commit bdf7892 · Parent: 00851d4

Update README.md

README.md CHANGED
@@ -38,12 +38,12 @@ cd GGUF-Tool-Suite
 rm -f download.conf # Make sure to copy the relevant download.conf for the model before running quant_assign.py
 cp -f models/DeepSeek-R1-0528/download.conf . # Use the download.conf of the chosen model
 mkdir -p kitchen && cd kitchen
-../quant_downloader.sh ../recipe_examples/DeepSeek-R1-0528.
+../quant_downloader.sh ../recipe_examples/ik_harmonized_recipes/DeepSeek-R1-0528.ROOT-2.7921bpw-3.4451ppl.218GB-GGUF_14GB-GPU_204GB-CPU.90e3c2f_6f5170d.recipe
 
 # Other recipe examples can be found at https://github.com/Thireus/GGUF-Tool-Suite/tree/main/recipe_examples
 
 # Launch ik_llama's llama-cli:
-ulimit -n
+ulimit -n 9999 # Lifts "too many open files" limitation on Linux
 ~/ik_llama.cpp/build/bin/llama-cli \
 -m DeepSeek-R1-0528-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01148.gguf \
 -mla 3 -fa -amb 512 -fmoe -ctk f16 -c 4096 -ngl 99 \
@@ -76,6 +76,8 @@ Here’s how DeepSeek-R1-0528 quantized with **Thireus’ GGUF Tool Suite** stac
 
 More perplexity/bpw graphs for other supported models: https://github.com/Thireus/GGUF-Tool-Suite/tree/main/ppl_graphs
 
+*All PPL values are computed with the parameters `-ctk f16 -c 512 -b 4096 -ub 4096`. Changing any of these parameters will alter the PPL. In particular, reducing `-b 4096 -ub 4096` increases the PPL, while increasing them decreases the PPL.*
+
 ---
 
 ## 🚀 How do I get started?