Text Generation
GGUF
English
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
llama 3.1
llama-3
llama3
llama-3.1
science fiction
romance
all genres
story
writing
vivid prosing
vivid writing
fiction
roleplaying
bfloat16
swearing
role play
sillytavern
backyard
horror
context 128k
mergekit
Merge
6X8B
Mixture of Experts
mixture of experts
Not-For-All-Audiences
imatrix
conversational
Update README.md
README.md CHANGED
```diff
@@ -76,6 +76,8 @@ Higher temps will result in deeper, richer "thoughts"... and frankly more interesting
 
 With the MOE setup, this model's thinking/output is even stronger.
 
+The "Horror Imatrix" was built using Grand Horror 16B (at my repo). This adds a "tint" of horror to the model.
+
 The "thinking/reasoning" tech (for the model at this repo) is from the original Llama 3.1 "DeepHermes" model from NousResearch:
 
 [ https://huggingface.co/NousResearch/DeepHermes-3-Llama-3-8B-Preview ]
@@ -85,6 +87,38 @@ Please visit their repo for all information on features, test results and so on.
 
 ---
 
+<b>"HORROR IMATRIX" and Quants</b>
+
+A strong, in-house built imatrix dataset by David_AU which results in better overall function,
+instruction following, output quality and stronger connections to ideas, concepts and the world in general.
+
+This chart shows the quants in order of "BPW" (bits per weight), mapped below with relative "strength" to one another, with "IQ1_S" the least and "Q8_0" the most ("F16" is full precision):
+
+<small>
+<pre>
+IQ1_S | IQ1_M
+IQ2_XXS | IQ2_XS | Q2_K_S | IQ2_S | Q2_K | IQ2_M
+IQ3_XXS | Q3_K_S | IQ3_XS | IQ3_S | IQ3_M | Q3_K_M | Q3_K_L
+Q4_K_S | IQ4_XS | IQ4_NL | Q4_K_M
+Q5_K_S | Q5_K_M
+Q6_K
+Q8_0
+F16
+</pre>
+</small>
+
+Recommend quants IQ3s / IQ4XS / IQ4NL / Q4s for best creative results.
+
+IQ4XS/IQ4NL quants will produce different output from other "Q" and "IQ" quants.
+
+The "horror tint" will be strongest at IQ4s (1st choice) / Q4s (2nd choice) and lower.
+
+Recommend q5s/q6/q8 for general usage.
+
+Note that IQ1s performance is acceptable, whereas IQ2s and up are strong.
+
+More information on quants is in the document below: "Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers".
+
 <B>IMPORTANT OPERATING INSTRUCTIONS:</B>
 
 This is an instruct model with reasoning crafted onto the 6 CORE models in a MOE config.
@@ -106,12 +140,6 @@ Note that the reasoning/thinking section is often a lot less "tame" than the final output.
 
 Suggest a minimum context of 4k, but 8k is better due to reasoning/output blocks.
 
-MAX QUANTS:
-
-There will be two max quants, IQ4XS and Q8 ("MAX" in the file name).
-
-The thinking/output will be enhanced by the output tensor being enlarged to bf16.
-
 KNOWN ISSUES:
 
 - You may need to hit regen sometimes to get the thinking/reasoning to activate / get a good "thinking block".
@@ -119,7 +147,6 @@ KNOWN ISSUES:
 - Sometimes the thinking block will end, and you need to manually prompt the model to "generate" the output.
 - This model can sometimes generate really long output and/or never want to "end" the output - close to a rant, but deeper. It is surprising what it can come up with.
 
-
 <B>USE CASES:</B>
 
 This model is for all use cases, but designed for creative use cases specifically.
```