Thireus/GLM-4.5-THIREUS-IQ3_KT-SPECIAL_SPLIT

Thireus committed 4 days ago

Commit fedd15b · 1 Parent(s): 5618e46

Update README.md

Files changed (1)

README.md +4 -2

@@ -47,12 +47,12 @@ cd GGUF-Tool-Suite
 rm -f download.conf # Make sure to copy the relevant download.conf for the model before running quant_assign.py
 cp -f models/GLM-4.5/download.conf . # Use the download.conf of the chosen model
 mkdir -p kitchen && cd kitchen
-../quant_downloader.sh ../recipe_examples/GLM-4.5.ROOT-2.0085bpw-5.2486ppl.83GB-GGUF_7GB-GPU_76GB-CPU.a02563d_cdb0394.recipe
+../quant_downloader.sh ../recipe_examples/ik_harmonized_recipes/GLM-4.5.ROOT-4.1636bpw-3.2647ppl.173GB-GGUF_12GB-GPU_160GB-CPU.90e3c2f_1ac651c.recipe
 # Other recipe examples can be found at https://github.com/Thireus/GGUF-Tool-Suite/tree/main/recipe_examples
 # Launch ik_llama's llama-server:
-ulimit -n 99999 # Lifts "too many open files" limitation on Linux
+ulimit -n 9999 # Lifts "too many open files" limitation on Linux
 ~/ik_llama.cpp/build/bin/llama-server \
   -m GLM-4.5-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01762.gguf \
   -fa -fmoe -ctk f16 -c 4096 -ngl 99 \
@@ -86,6 +86,8 @@ Here’s how GLM-4.5 quantized with **Thireus’ GGUF Tool Suite** stacks up aga
 More perplexity/bpw graphs for other supported models: https://github.com/Thireus/GGUF-Tool-Suite/tree/main/ppl_graphs
+*All PPL values are computed with the parameters `-ctk f16 -c 512 -b 4096 -ub 4096`. Changing any of these parameters will alter the PPL. In particular, reducing `-b 4096 -ub 4096` increases the PPL, while increasing them decreases the PPL.*
 ---
 ## 🚀 How do I get started?
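
Note on the `ulimit -n` change: the model is loaded from a first shard named `...-00001-of-01762.gguf`, so the open-file limit presumably has to exceed the shard count. The sketch below is not part of the commit or of GGUF-Tool-Suite. It assumes the `NNNNN-of-NNNNN.gguf` suffix gives the total number of shards and that each shard is held open while the server runs, and it compares that count with the shell's current limit.

```shell
# Sketch only: compare the shard count encoded in the first shard's file name
# with the current soft limit on open file descriptors.
# Assumption: the "-of-NNNNN.gguf" suffix is the total number of shard files.
first_shard="GLM-4.5-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01762.gguf"

# Keep the digits between "-of-" and ".gguf".
total="${first_shard##*-of-}"
total="${total%.gguf}"
# Strip leading zeros so the number is never interpreted as octal.
total="$(printf '%s' "$total" | sed 's/^0*//')"

limit="$(ulimit -n)"   # prints a number, or "unlimited"

if [ "$limit" != "unlimited" ] && [ "$limit" -le "$total" ]; then
  echo "open-file limit $limit may be too low for $total shards; try: ulimit -n 9999"
else
  echo "open-file limit $limit covers $total shards"
fi
```

Under that assumption, the new value of 9999 still leaves ample headroom over 1762 shards.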