Fizzarolli commited on
Commit
1be2f05
·
verified ·
1 Parent(s): 96995d9

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -1
README.md CHANGED
@@ -56,7 +56,9 @@ MLX:
56
 
57
  - Format is plain-old ChatML (please note that, unlike regular Qwen 3, you do *not* need to prefill empty think tags for it not to reason -- see below).
58
 
59
- - Settings used by testers varied, but Fizz and inflatebot used the same settings and system prompt recommended for [GLM4-32B-Neon-v2.](https://huggingface.co/allura-org/GLM4-32B-Neon-v2)
 
 
60
 
61
  - The official instruction following version of Qwen3-8B was not used as a base. Instruction-following is trained in post-hoc, and "thinking" traces were not included. __As a result of this, "thinking" will not function.__
62
 
 
56
 
57
  - Format is plain-old ChatML (please note that, unlike regular Qwen 3, you do *not* need to prefill empty think tags for it not to reason -- see below).
58
 
59
+ - Settings used by testers varied, but we generally stayed around 0.9 temperature and 0.1 min p. Do *not* use repetition penalties (DRY included). They break it.
60
+
61
+ - Any system prompt can likely be used, but I used the Shingame system prompt (link will be added later i promise)
62
 
63
  - The official instruction following version of Qwen3-8B was not used as a base. Instruction-following is trained in post-hoc, and "thinking" traces were not included. __As a result of this, "thinking" will not function.__
64