  # Qwen3-30B-A3B-YOYO-V2-q6-hi-mlx
This card compares the YOYO-V2 model (a merge of Qwen's Thinking, Instruct, and Coder models) with the individual Thinking and Coder models, to analyze how the merge affected overall performance across different language-intelligence tasks.

Note that the Instruct model is not explicitly represented in this dataset (it is excluded from the metrics).

## Key Benchmark Comparison

Below is a breakdown of YOYO-V2's performance relative to the Thinking and Coder models across 7 tasks:

| Task          | YOYO-V2 | Thinking | Coder | YOYO-V2 advantage over Coder |
|---------------|---------|----------|-------|------------------------------|
| arc_challenge | 0.532   | 0.414    | 0.417 | +0.115                       |
| arc_easy      | 0.685   | 0.444    | 0.529 | +0.156                       |
| boolq         | 0.886   | 0.702    | 0.881 | +0.005 (slight gain)         |
| hellaswag     | 0.683   | 0.632    | 0.545 | +0.138                       |
| openbookqa    | 0.456   | 0.396    | 0.426 | +0.030                       |
| piqa          | 0.782   | 0.763    | 0.720 | +0.062                       |
| winogrande    | 0.639   | 0.666    | 0.572 | +0.067                       |

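As a sanity check, the advantage column can be recomputed from the per-task scores (a small sketch; the numbers are transcribed from the table above):

```python
# Recompute YOYO-V2's advantage over the Coder model from the table above.
yoyo = {"arc_challenge": 0.532, "arc_easy": 0.685, "boolq": 0.886,
        "hellaswag": 0.683, "openbookqa": 0.456, "piqa": 0.782,
        "winogrande": 0.639}
coder = {"arc_challenge": 0.417, "arc_easy": 0.529, "boolq": 0.881,
         "hellaswag": 0.545, "openbookqa": 0.426, "piqa": 0.720,
         "winogrande": 0.572}

# Positive values mean the merged model beats Coder on that task.
advantage = {task: round(yoyo[task] - coder[task], 3) for task in yoyo}
print(advantage)  # advantage["arc_easy"] == 0.156
```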
## How the Merge Affected Overall Performance

**Net positive impact across tasks:**

YOYO-V2 outperforms both the Thinking and Coder models in 6 out of 7 tasks.

The most significant gains are seen in:

- arc_easy: YOYO-V2 jumps from 0.529 (Coder) to 0.685, a +0.156 gain (roughly +29% relative).
- hellaswag: YOYO-V2 climbs from 0.545 (Coder) to 0.683 (+0.138, about +25% relative).
- piqa: YOYO-V2 achieves 0.782 vs. Coder's 0.720 (+0.062, about +8.6% relative).

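The relative percentages above can be derived directly from the absolute scores (a quick sketch using the numbers quoted in this section):

```python
# Relative improvement over the Coder model for the three biggest gains.
# Each entry maps task -> (coder_score, yoyo_v2_score).
gains = {"arc_easy": (0.529, 0.685),
         "hellaswag": (0.545, 0.683),
         "piqa": (0.720, 0.782)}

rel = {task: round(100 * (yoyo - coder) / coder, 1)
       for task, (coder, yoyo) in gains.items()}
print(rel)  # {'arc_easy': 29.5, 'hellaswag': 25.3, 'piqa': 8.6}
```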
**Minor trade-offs in specific tasks:**

YOYO-V2 slightly underperforms Thinking on winogrande (0.639 vs. 0.666), but this is offset by its superiority on the other tasks.

On boolq, YOYO-V2's score is very close to Coder's (0.886 vs. 0.881), showing minimal gains from the merge (likely due to task-specific alignment).

## Why This Matters

The merge likely leverages the complementary strengths of the three Qwen models (e.g., Thinking for reasoning, Coder for code generation, and Instruct for instruction following). YOYO-V2's higher scores indicate the merge effectively harmonized these capabilities without severe drawbacks.

The overall trend is clear: the merged model achieves better or comparable results across the majority of benchmarks, with gains in downstream tasks that demand flexibility (e.g., reasoning, text generation).

## Conclusion

YOYO-V2's performance demonstrates that merging the Qwen Thinking, Coder, and Instruct models (at q6 quantization) generally enhances overall task performance across diverse language-intelligence benchmarks. The model shows the most dramatic improvements in tasks like arc_easy and hellaswag, where it excels by integrating specialized knowledge from each component model. While minor losses exist in a few tasks (e.g., winogrande), the net effect is positive and robust, validating YOYO-V2 as a stronger multi-purpose model for real-world applications.

Takeaway: for Qwen users, YOYO-V2 is recommended if your use cases span reasoning (arc), code generation (Coder), and instruction following (Instruct). It provides a more balanced, high-performing solution than the base models alone.

*Reviewed by qwen3-jan-v1-256k-ctx-6b-brainstorm20x-qx6-mlx*

The `hi` variant improves on the standard q6 quantization by using a group size of 32, and should perform better than the plain q6 model.

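A hedged sketch of how such a conversion can be reproduced with the `mlx_lm.convert` CLI (the output path is an assumption; check the flags against your installed mlx-lm version):

```shell
# Quantize to 6 bits with group size 32 (the "hi" variant) -- assumed paths.
mlx_lm.convert \
    --hf-path YOYO-AI/Qwen3-30B-A3B-YOYO-V2 \
    --mlx-path Qwen3-30B-A3B-YOYO-V2-q6-hi-mlx \
    -q --q-bits 6 --q-group-size 32
```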
This model [Qwen3-30B-A3B-YOYO-V2-q6-hi-mlx](https://huggingface.co/Qwen3-30B-A3B-YOYO-V2-q6-hi-mlx) was
converted to MLX format from [YOYO-AI/Qwen3-30B-A3B-YOYO-V2](https://huggingface.co/YOYO-AI/Qwen3-30B-A3B-YOYO-V2)
using mlx-lm version **0.26.4**.
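A quick way to try the converted model from the command line with the mlx-lm CLI (the model id here assumes it matches this repo's name; downloading the weights is required):

```shell
# Generate a short completion with the quantized model -- a usage sketch.
mlx_lm.generate \
    --model Qwen3-30B-A3B-YOYO-V2-q6-hi-mlx \
    --prompt "Explain model merging in one sentence." \
    --max-tokens 100
```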