allura-org
/

Q3-8B-Kintsugi

Text Generation

text-generation-inference

Model card Files Files and versions

Fizzarolli commited on Jun 16

Commit

1be2f05

·

verified ·

1 Parent(s): 96995d9

Update README.md

Files changed (1) hide show

README.md +3 -1

README.md CHANGED Viewed

@@ -56,7 +56,9 @@ MLX:
 - Format is plain-old ChatML (please note that, unlike regular Qwen 3, you do *not* need to prefill empty think tags for it not to reason -- see below).
-- Settings used by testers varied, but Fizz and inflatebot used the same settings and system prompt recommended for [GLM4-32B-Neon-v2.](https://huggingface.co/allura-org/GLM4-32B-Neon-v2)
 - The official instruction following version of Qwen3-8B was not used as a base. Instruction-following is trained in post-hoc, and "thinking" traces were not included. __As a result of this, "thinking" will not function.__

 - Format is plain-old ChatML (please note that, unlike regular Qwen 3, you do *not* need to prefill empty think tags for it not to reason -- see below).
+- Settings used by testers varied, but we generally stayed around 0.9 temperature and 0.1 min p. Do *not* use repetition penalties (DRY included). They break it.
+- Any system prompt can likely be used, but I used the Shingame system prompt (link will be added later i promise)
 - The official instruction following version of Qwen3-8B was not used as a base. Instruction-following is trained in post-hoc, and "thinking" traces were not included. __As a result of this, "thinking" will not function.__