allura-org/GLM4-9B-Neon-v2

Text Generation

AuriAetherwiing committed on Apr 26

Commit a66e498 · verified · 1 Parent(s): a8a1326

Update README.md

Files changed (1)

README.md +17 -0

@@ -46,6 +46,23 @@ Min-P - 0.1
 Repetition Penalty - 1.03
 ```
+**Running on KoboldCPP and other backends**
+To run GGUFs correctly, you need the most recent version of KoboldCPP. Pass `--overridekv glm4.rope.dimension_count=int:64` on the command line, or enter `glm4.rope.dimension_count=int:64` in the overridekv box in the GUI (at the very bottom of the Tokens tab).
+Thanks to DaringDuck and tofumagnate for the info on how to apply this fix.
+To run this model on vLLM, you'll need to build it from source from the git repo, since GLM4 support hasn't reached a release yet. ExLLaMAv2 and v3 don't support the GLM4 arch at the moment.
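
For the KoboldCPP case, a minimal sketch of how the launch command could be assembled is shown here. Only the `--overridekv` flag and its value come from the README text. The GGUF filename is a hypothetical placeholder, and the `python koboldcpp.py --model ...` launcher form is an assumption about running KoboldCPP from a source checkout; adjust both to match your install.

```python
import shlex

# Assumption: hypothetical local GGUF filename; substitute your own file.
MODEL_PATH = "GLM4-9B-Neon-v2.Q4_K_M.gguf"

# Key/value override quoted from the README text.
ROPE_OVERRIDE = "glm4.rope.dimension_count=int:64"


def build_koboldcpp_command(model_path, launcher=("python", "koboldcpp.py")):
    """Return the argv list for launching KoboldCPP with the GLM4 rope override.

    The default launcher is an assumption (running from a source checkout);
    pass a different tuple if you use a prebuilt binary.
    """
    return [*launcher, "--model", model_path, "--overridekv", ROPE_OVERRIDE]


command = build_koboldcpp_command(MODEL_PATH)

# Print a copy-pasteable shell command rather than running it.
print(shlex.join(command))
```

Building the command as a list keeps the override as one argument, so it can be handed to `subprocess.run(command)` without any shell quoting concerns.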
+**Special Thanks**
+Once again, huge kudos to OwenArli for providing compute and helping with tuning along the way!
+Big thanks to Artus for providing free inference for the pre-release showcase of this model!
+And big thanks to the BeaverAI community for giving feedback and helping to figure out the optimal settings!
+---
 **Training config**
 <details><summary>See Axolotl config</summary>