</p>

## <span style="color: #CCFFCC;">Overview</span>

SnowDrogito-RpR-32B_IQ4-XS is a QwQ RP reasoning merge built to add smarts to the popular <span style="color: #ADD8E6;">Snowdrop</span> roleplay model, with a little <span style="color: #FF9999;">ArliAI RpR</span> and <span style="color: #00FF00;">Deepcogito</span> supplying the extra intelligence. Built with the TIES merge method, it attempts to combine strengths from multiple fine-tuned QwQ-32B models, quantized to IQ4_XS with <span style="color: #E6E6FA;">Q8_0 embeddings and output layers</span> for enhanced quality. Uploading because the perplexity measured lower and it has been giving more varied, longer, and more creative responses, though it may lose some contextual awareness compared to Snowdrop.

## <span style="color: #CCFFCC;">Setup for Reasoning and ChatML</span>

- **ChatML Formatting**: Use ChatML with `<|im_start|>role\ncontent<|im_end|>\n` (e.g., `<|im_start|>user\nHello!<|im_end|>\n`).
- **Reasoning Settings**: Set "include names" to "never." Start the reply with `<think>\n` to enable reasoning.
- **Sampler Settings**: Try temperature 0.9, min_p 0.05, top_a 0.3, TFS 0.75, repetition_penalty 1.03, and DRY if available.

For more details, see the setup guides and SillyTavern master import for <a href="https://huggingface.co/trashpanda-org/QwQ-32B-Snowdrop-v0" style="color: #ADD8E6; text-decoration: none;" onmouseover="this.style.color='#E6E6FA'" onmouseout="this.style.color='#ADD8E6'">Snowdrop</a>, and other info on <a href="https://huggingface.co/ArliAI/QwQ-32B-ArliAI-RpR-v1" style="color: #FF9999; text-decoration: none;" onmouseover="this.style.color='#E6E6FA'" onmouseout="this.style.color='#FF9999'">ArliAI RpR</a>.
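The settings above can be sketched in code. This is a minimal illustration, not part of the model card: the helper name, the message structure, and the sampler-key spellings are assumptions (key names vary by backend).

```python
# Sketch of the setup above: ChatML formatting, a prefilled <think> tag,
# and the suggested sampler values. All names here are illustrative only.

def build_chatml_prompt(messages, enable_reasoning=True):
    """Render (role, content) pairs as ChatML and open an assistant turn."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>\n" for role, content in messages]
    parts.append("<|im_start|>assistant\n")
    if enable_reasoning:
        parts.append("<think>\n")  # starting the reply with <think> enables reasoning
    return "".join(parts)

# Suggested samplers as a request payload; key spellings differ per backend.
SAMPLERS = {
    "temperature": 0.9,
    "min_p": 0.05,
    "top_a": 0.3,
    "tfs": 0.75,
    "repetition_penalty": 1.03,
}

prompt = build_chatml_prompt([("user", "Hello!")])
```

Passing `enable_reasoning=False` leaves the assistant turn open without the `<think>` prefix, for frontends that inject it themselves.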

## <span style="color: #CCFFCC;">Performance</span>

- Perplexity under identical conditions (IQ4_XS, 40,960-token context, Q8_0 KV cache, on a 150K-token chat dataset), SnowDrogito-RpR-32B vs <span style="color: #ADD8E6;">QwQ-32B-Snowdrop-v0</span>:
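Perplexity here is the standard exp of the mean per-token negative log-likelihood; a minimal sketch of that computation (the log-probabilities below are made up for illustration, not measurements from this model):

```python
import math

def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood over the evaluated tokens."""
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)

# e.g. four tokens each assigned probability 0.5 -> perplexity 2.0
ppl = perplexity([math.log(0.5)] * 4)
```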
- Architecture: Qwen 2.5 (32B parameters)
- Context Length: 40,960 tokens
- Quantization: IQ4_XS with <span style="color: #E6E6FA;">Q8_0 embeddings and output layers</span> for better quality
- Importance Matrix: used the .imatrix file from Snowdrop

## <span style="color: #CCFFCC;">Merge Configuration</span>

This model was created using mergekit with the following TIES merge configuration: