</p>

## <span style="color: #CCFFCC;">Overview</span>

SnowDrogito-RpR-32B_IQ4-XS is a QwQ RP reasoning merge built to add smarts to the popular <span style="color: #ADD8E6;">Snowdrop</span> roleplay model, with a little <span style="color: #FF9999;">ArliAI RpR</span> and <span style="color: #00FF00;">Deepcogito</span> supplying the extra intelligence. Built with the TIES merge method, it attempts to combine strengths from multiple fine-tuned QwQ-32B models, quantized to IQ4_XS with <span style="color: #E6E6FA;">Q8_0 embeddings and output layers</span> for enhanced quality. Uploading because the perplexity measured lower and it has been giving more varied, longer, and more creative responses, though it may lose some contextual awareness compared to Snowdrop.

## <span style="color: #CCFFCC;">Setup for Reasoning and ChatML</span>

- **ChatML Formatting**: Use ChatML with `<|im_start|>role\ncontent<|im_end|>\n` (e.g., `<|im_start|>user\nHello!<|im_end|>\n`).
- **Reasoning Settings**: Set "include names" to "never." Start the reply with `<think>\n` to enable reasoning.
- **Sampler Settings**: Try temperature 0.9, min_p 0.05, top_a 0.3, TFS 0.75, repetition_penalty 1.03, and DRY if available.

For more details, see the setup guides and SillyTavern master import for <a href="https://huggingface.co/trashpanda-org/QwQ-32B-Snowdrop-v0" style="color: #ADD8E6; text-decoration: none;" onmouseover="this.style.color='#E6E6FA'" onmouseout="this.style.color='#ADD8E6'">Snowdrop</a>, and other info on <a href="https://huggingface.co/ArliAI/QwQ-32B-ArliAI-RpR-v1" style="color: #FF9999; text-decoration: none;" onmouseover="this.style.color='#E6E6FA'" onmouseout="this.style.color='#FF9999'">ArliAI RpR</a>.
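The settings above can be sketched in code. This is a minimal illustration, not part of the model card: the helper name, the message structure, and the sampler-key spellings are assumptions (key names vary by backend).

```python
# Sketch of the setup above: ChatML formatting, a prefilled <think> tag,
# and the suggested sampler values. All names here are illustrative only.

def build_chatml_prompt(messages, enable_reasoning=True):
    """Render (role, content) pairs as ChatML and open an assistant turn."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>\n" for role, content in messages]
    parts.append("<|im_start|>assistant\n")
    if enable_reasoning:
        parts.append("<think>\n")  # starting the reply with <think> enables reasoning
    return "".join(parts)

# Suggested samplers as a request payload; key spellings differ per backend.
SAMPLERS = {
    "temperature": 0.9,
    "min_p": 0.05,
    "top_a": 0.3,
    "tfs": 0.75,
    "repetition_penalty": 1.03,
}

prompt = build_chatml_prompt([("user", "Hello!")])
```

Passing `enable_reasoning=False` leaves the assistant turn open without the `<think>` prefix, for frontends that inject it themselves.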

## <span style="color: #CCFFCC;">Performance</span>

- Perplexity under identical conditions (IQ4_XS, 40,960-token context, Q8_0 KV cache, on a 150K-token chat dataset), SnowDrogito-RpR-32B vs <span style="color: #ADD8E6;">QwQ-32B-Snowdrop-v0</span>:
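Perplexity here is the standard exp of the mean per-token negative log-likelihood; a minimal sketch of that computation (the log-probabilities below are made up for illustration, not measurements from this model):

```python
import math

def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood over the evaluated tokens."""
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)

# e.g. four tokens each assigned probability 0.5 -> perplexity 2.0
ppl = perplexity([math.log(0.5)] * 4)
```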
- Architecture: Qwen 2.5 (32B parameters)
- Context Length: 40,960 tokens
- Quantization: IQ4_XS with <span style="color: #E6E6FA;">Q8_0 embeddings and output layers</span> for better quality
- Importance Matrix: used the .imatrix file from Snowdrop

## <span style="color: #CCFFCC;">Merge Configuration</span>

This model was created using mergekit with the following TIES merge configuration: