Thireus committed
Commit fedd15b · 1 parent: 5618e46

Update README.md

Files changed (1): README.md (+4 −2)
README.md CHANGED
@@ -47,12 +47,12 @@ cd GGUF-Tool-Suite
 rm -f download.conf # Make sure to copy the relevant download.conf for the model before running quant_assign.py
 cp -f models/GLM-4.5/download.conf . # Use the download.conf of the chosen model
 mkdir -p kitchen && cd kitchen
-../quant_downloader.sh ../recipe_examples/GLM-4.5.ROOT-2.0085bpw-5.2486ppl.83GB-GGUF_7GB-GPU_76GB-CPU.a02563d_cdb0394.recipe
+../quant_downloader.sh ../recipe_examples/ik_harmonized_recipes/GLM-4.5.ROOT-4.1636bpw-3.2647ppl.173GB-GGUF_12GB-GPU_160GB-CPU.90e3c2f_1ac651c.recipe
 
 # Other recipe examples can be found at https://github.com/Thireus/GGUF-Tool-Suite/tree/main/recipe_examples
 
 # Launch ik_llama's llama-server:
-ulimit -n 99999 # Lifts "too many open files" limitation on Linux
+ulimit -n 9999 # Lifts "too many open files" limitation on Linux
 ~/ik_llama.cpp/build/bin/llama-server \
 -m GLM-4.5-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01762.gguf \
 -fa -fmoe -ctk f16 -c 4096 -ngl 99 \
@@ -86,6 +86,8 @@ Here’s how GLM-4.5 quantized with **Thireus’ GGUF Tool Suite** stacks up aga
 
 More perplexity/bpw graphs for other supported models: https://github.com/Thireus/GGUF-Tool-Suite/tree/main/ppl_graphs
 
+*All PPL values are computed with the parameters `-ctk f16 -c 512 -b 4096 -ub 4096`. Changing any of these parameters will alter the PPL. In particular, reducing `-b 4096 -ub 4096` increases the PPL, while increasing them decreases the PPL.*
+
 ---
 
 ## 🚀 How do I get started?
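The footnote added by this commit explains that the reported PPL values depend on the evaluation parameters. As background for reading those numbers, here is a minimal sketch of the standard perplexity formula (the general definition, not code from this repository): PPL is the exponential of the mean negative log-probability the model assigns to each evaluated token.

```python
import math

def perplexity(logprobs):
    """PPL = exp(-(1/N) * sum of per-token log-probabilities)."""
    return math.exp(-sum(logprobs) / len(logprobs))

# Illustrative values only: a model assigning probability 0.25 to each of
# 4 tokens has PPL exp(-ln 0.25) = 4, i.e. it is "as uncertain as" a
# uniform choice among 4 tokens.
print(round(perplexity([math.log(0.25)] * 4), 6))  # → 4.0
```

Lower PPL means the model assigns higher probability to the reference text, which is why the recipe filenames above pair a bits-per-weight figure (bpw) with a PPL figure: the footnote's caveat matters because changing `-c`, `-b`, or `-ub` changes which token probabilities enter this average.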