Commit 2828793 (verified) · skatardude10 committed · 1 parent: 40cdff1

Upload SnowDrogito-RpR3-32B_IQ4-XS+Enhanced_Tensors.gguf


This quantization keeps critical tensors at higher precision than IQ4_XS (Q8_0, Q6_K, and Q5_K for FFN, attention, and embedding tensors) to improve long-context coherence and accuracy, while keeping the overall file size comparable to Q4_K_M.

This quantization of SnowDrogito-RpR3-32B (IQ4_XS base) uses a re-calibrated importance matrix (8192-token context; RP logs, the Tao Te Ching, and a curated dataset) to improve on the prior model, which targeted attention tensors only (e.g., SnowDrogito-RpRv3-32B_IQ4-XS-Q8InOut-Q56Attn.gguf).

Using llama.cpp's tensor-type regex, we targeted FFN down (Q8_0 for layers 62-63, Q6_K for layers 43-61, Q5_K for layers 29-42) and FFN up (Q6_K for layers 59-63). Based on the re-calibrated importance matrix, this prioritizes late layers for long-context coherence and middle layers for narrative depth. Attention tensors (Q6_K/Q5_K for layers 14, 15, 23, 24, 26, and 55-62) are raised to improve contextual understanding and reduce issues such as dialogue inconsistencies and factual errors. Embeddings are kept at Q8_0 to preserve token quality.

The result is aimed at 40K-token RP tasks, with improved memory, accuracy, and dialogue consistency at the cost of a 1.508 GB increase in VRAM use.
![image (13).jpg](https://cdn-uploads.huggingface.co/production/uploads/633e3b4136e87ddc64ad584d/Byazxpw46xP8CKEzmt-cl.jpeg)
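
The FFN layer ranges listed in the commit message can be sketched as regex overrides. This is a minimal illustration, not the author's actual command: it assumes GGUF tensor names of the form `blk.<layer>.ffn_down.weight`, and it assumes the overrides are passed to llama.cpp's quantize tool as `--tensor-type 'PATTERN=TYPE'`. The helper names (`layer_pattern`, `build_overrides`, `resolve`) are hypothetical. Attention overrides are left out because the message does not say which of those layers received Q6_K and which Q5_K.

```python
import re

# (tensor, layers, type) taken from the commit message; FFN tensors only.
FFN_OVERRIDES = [
    ("ffn_down", range(62, 64), "q8_0"),  # layers 62-63
    ("ffn_down", range(43, 62), "q6_k"),  # layers 43-61
    ("ffn_down", range(29, 43), "q5_k"),  # layers 29-42
    ("ffn_up", range(59, 64), "q6_k"),    # layers 59-63
]


def layer_pattern(tensor, layers):
    """Build a regex matching `tensor` in any of `layers` (assumed naming: blk.N.tensor)."""
    alternation = "|".join(str(i) for i in layers)
    return rf"blk\.({alternation})\.{tensor}"


def build_overrides(spec):
    """Turn the (tensor, layers, type) spec into (regex, type) pairs."""
    return [(layer_pattern(t, layers), qtype) for t, layers, qtype in spec]


def resolve(name, overrides, default="iq4_xs"):
    """Return the type of the first matching override, else the base type."""
    for pattern, qtype in overrides:
        if re.search(pattern, name):
            return qtype
    return default


OVERRIDES = build_overrides(FFN_OVERRIDES)

if __name__ == "__main__":
    # Assumed CLI spelling; check the help output of your llama.cpp build.
    for pattern, qtype in OVERRIDES:
        print(f"--tensor-type '{pattern}={qtype}'")
```

Calling `resolve("blk.42.ffn_down.weight", OVERRIDES)` is a quick way to check which type a given tensor would receive before running a long quantization job.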

.gitattributes CHANGED
@@ -36,3 +36,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 SnowDrogito-RpR-32B_IQ4-XS.gguf filter=lfs diff=lfs merge=lfs -text
 SnowDrogito-RpRv3-32B_IQ4-XS.gguf filter=lfs diff=lfs merge=lfs -text
 SnowDrogito-RpRv3-32B_IQ4-XS-Q8InOut-Q56Attn.gguf filter=lfs diff=lfs merge=lfs -text
+SnowDrogito-RpR3-32B_IQ4-XS+Enhanced_Tensors.gguf filter=lfs diff=lfs merge=lfs -text
SnowDrogito-RpR3-32B_IQ4-XS+Enhanced_Tensors.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0c3ba42036e5cb80dab5586d721e78c9dd778e55e737f4e693968169ae41d031
+size 19877727232
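
The added file is a Git LFS pointer: three `key value` lines that stand in for the real GGUF blob. As a rough sketch, the pointer can be parsed to read off the object hash and the file size. The function name `parse_lfs_pointer` is hypothetical, and the parser handles only the three keys shown in this diff.

```python
# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0c3ba42036e5cb80dab5586d721e78c9dd778e55e737f4e693968169ae41d031
size 19877727232
"""


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


POINTER = parse_lfs_pointer(POINTER_TEXT)

if __name__ == "__main__":
    size = POINTER["size"]
    print(f"{POINTER['hash_algo']} digest: {POINTER['digest']}")
    print(f"size: {size / 10**9:.2f} GB ({size / 2**30:.2f} GiB)")
```

The `size` field is the byte count of the uploaded GGUF, so it gives the on-disk footprint of this quantization without downloading the file.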