Thireus committed
Commit bdf7892 · 1 Parent(s): 00851d4

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -38,12 +38,12 @@ cd GGUF-Tool-Suite
 rm -f download.conf # Make sure to copy the relevant download.conf for the model before running quant_assign.py
 cp -f models/DeepSeek-R1-0528/download.conf . # Use the download.conf of the chosen model
 mkdir -p kitchen && cd kitchen
-../quant_downloader.sh ../recipe_examples/DeepSeek-R1-0528.THIREUS-1.9364bpw-4.3533ppl.151GB-GGUF_11GB-GPU_140GB-CPU.3c88ec6_9fd615d.recipe
+../quant_downloader.sh ../recipe_examples/ik_harmonized_recipes/DeepSeek-R1-0528.ROOT-2.7921bpw-3.4451ppl.218GB-GGUF_14GB-GPU_204GB-CPU.90e3c2f_6f5170d.recipe
 
 # Other recipe examples can be found at https://github.com/Thireus/GGUF-Tool-Suite/tree/main/recipe_examples
 
 # Launch ik_llama's llama-cli:
-ulimit -n 99999 # Lifts "too many open files" limitation on Linux
+ulimit -n 9999 # Lifts "too many open files" limitation on Linux
 ~/ik_llama.cpp/build/bin/llama-cli \
 -m DeepSeek-R1-0528-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01148.gguf \
 -mla 3 -fa -amb 512 -fmoe -ctk f16 -c 4096 -ngl 99 \
@@ -76,6 +76,8 @@ Here’s how DeepSeek-R1-0528 quantized with **Thireus’ GGUF Tool Suite** stac
 
 More perplexity/bpw graphs for other supported models: https://github.com/Thireus/GGUF-Tool-Suite/tree/main/ppl_graphs
 
+*All PPL values are computed with the parameters `-ctk f16 -c 512 -b 4096 -ub 4096`. Changing any of these parameters will alter the PPL. In particular, reducing `-b 4096 -ub 4096` increases the PPL, while increasing them decreases the PPL.*
+
 ---
 
 ## 🚀 How do I get started?
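
A note on the `ulimit` line in this diff: the launch command points at a GGUF split into 1148 shards (`-00001-of-01148.gguf`), so llama-cli must hold open far more file descriptors than the default per-process limit (commonly 1024 on Linux) allows. A minimal sketch of checking and raising the limit; the value 9999 simply mirrors the updated README line:

```bash
# Show the current per-process open-file limit (often 1024 by default)
ulimit -n

# Raise it for this shell session so all 1148 GGUF shards can be opened;
# 9999 is the value used in the updated README
ulimit -n 9999
```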
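
The added PPL footnote pins the measurement to `-ctk f16 -c 512 -b 4096 -ub 4096`. As a rough sketch of how such a number could be reproduced, assuming ik_llama.cpp also builds a `llama-perplexity` binary alongside `llama-cli` (not shown in this commit) and using `wiki.test.raw` as a placeholder for whatever evaluation text was actually used:

```bash
# Hypothetical PPL run matching the footnote's parameters; the binary path
# mirrors the llama-cli path above and the dataset file is a placeholder
~/ik_llama.cpp/build/bin/llama-perplexity \
  -m DeepSeek-R1-0528-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01148.gguf \
  -f wiki.test.raw \
  -ctk f16 -c 512 -b 4096 -ub 4096
```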