nightmedia commited on
Commit
05a5ff4
Β·
verified Β·
1 Parent(s): fde6d8b

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +57 -0
README.md CHANGED
@@ -29,6 +29,63 @@ Winogrande 0.616 0.678 +0.062
29
 
30
  The 8B quantized models (specifically qx65-hi) outperform Qwen-q6 across 4 of 7 tasks β€” with the most dramatic gains on BoolQ (+0.095) and Winogrande (+0.062), while being slightly worse on ARC Easy.
31
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
32
 
33
  This model [Qwen3-8B-YOYO-V2-Hybrid-qx65-hi-mlx](https://huggingface.co/Qwen3-8B-YOYO-V2-Hybrid-qx65-hi-mlx) was
34
  converted to MLX format from [YOYO-AI/Qwen3-8B-YOYO-V2-Hybrid](https://huggingface.co/YOYO-AI/Qwen3-8B-YOYO-V2-Hybrid)
 
29
 
30
  The 8B quantized models (specifically qx65-hi) outperform Qwen-q6 across 4 of 7 tasks β€” with the most dramatic gains on BoolQ (+0.095) and Winogrande (+0.062), while being slightly worse on ARC Easy.
31
 
32
+ πŸ“Š Direct Performance Comparison: qx65-hi vs q5-hi
33
+ ```bash
34
+ Task qx65-hi q5-hi Difference
35
+ ARC Challenge 0.397 0.387 +0.010
36
+ ARC Easy 0.434 0.435 -0.001
37
+ BoolQ 0.622 0.621 +0.001
38
+ Hellaswag 0.636 0.635 +0.001
39
+ OpenBookQA 0.358 0.360 -0.002
40
+ PIQA 0.750 0.750 0.000
41
+ Winogrande 0.678 0.674 +0.004
42
+ ```
43
+
44
+
45
+ πŸ’‘ Key Takeaway:
46
+
47
+ qx65-hi slightly outperforms q5-hi across 4 of 7 tasks β€” with its most significant advantages in ARC Challenge (+0.010) and Winogrande (+0.004).
48
+
49
+ πŸ” Why qx65-hi is Slightly Better (The Technical Story)
50
+
51
+ This comparison shows how a small precision difference in quantization level makes a measurable impact:
52
+
53
+ qx65-hi wins on the most impactful tasks:
54
+ ```bash
55
+ +0.010 in ARC Challenge:
56
+ This matters because it reflects understanding of abstract concepts
57
+ (critical for many real-world applications)
58
+
59
+ +0.004 in Winogrande:
60
+ This is your largest practical advantage β€” especially valuable
61
+ for applications that need to understand contextual relationships in text
62
+ ```
63
+
64
+ q5-hi has a tiny edge on ARC Easy:
65
+
66
+ The +0.001 difference here explains why some users might prefer q5-hi for tasks requiring precise foundation-level reasoning.
67
+
68
+ Both models are nearly identical on PIQA:
69
+
70
+ They score the same (0.750), but this shows these quantization approaches have similar impact on logical reasoning β€” which is why you can safely choose either for tasks that require strict logic.
71
+
72
+ πŸ›  Practical Recommendations for Your Workflow
73
+ ```bash
74
+ Use Case Better Model Why It Works
75
+ ARC Challenge score qx65-hi +0.010 advantage in abstract understanding
76
+ Winogrande performance qx65-hi +0.004 lead in contextual inference (e.g., pronoun resolution)
77
+ ARC Easy scores q5-hi Slightly higher on this task (0.435 vs 0.434)
78
+ ```
79
+
80
+ πŸ’Ž Pro Insight:
81
+
82
+ The +0.010 difference in ARC Challenge means qx65-hi would be worth adopting for most applications β€” especially those where understanding abstract concepts is critical. The Winogrande gain (+0.004) further supports this recommendation.
83
+
84
+ 🌟 Final Recommendation
85
+
86
+ "For most real-world deployments, choose qx65-hi over q5-hi. It gives tiny but meaningful advantages in the most impactful tasks (ARC Challenge and Winogrande), while being nearly identical on others."
87
+
88
+ This difference may seem small, but it's exactly the type of precision you need to get real value from quantization β€” without needing a model that's much bigger or more complex than your current options.
89
 
90
  This model [Qwen3-8B-YOYO-V2-Hybrid-qx65-hi-mlx](https://huggingface.co/Qwen3-8B-YOYO-V2-Hybrid-qx65-hi-mlx) was
91
  converted to MLX format from [YOYO-AI/Qwen3-8B-YOYO-V2-Hybrid](https://huggingface.co/YOYO-AI/Qwen3-8B-YOYO-V2-Hybrid)