---
tags:
- unsloth
- mlx
base_model: unsloth/Qwen3-Coder-30B-A3B-Instruct
library_name: mlx
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
---

# unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx

Based on the benchmark results, qx4 is best suited for:

**Primary task: BoolQ (boolean questions)**

- qx4 scores 0.877 on BoolQ, the third-highest score in this comparison, only slightly behind q5 (0.883) and qx5 (0.880)
- This is excellent performance on boolean reasoning tasks

**Secondary strengths**

- HellaSwag: 0.552, the highest among all quantized variants, indicating superior commonsense reasoning and scenario understanding
- Arc_Challenge: 0.419, better than most other quantized variants, showing strong performance on challenging multiple-choice questions

**Task suitability analysis**

Best-suited tasks:

- BoolQ: strongest performer
- HellaSwag: highest among quantized variants
- Arc_Challenge: better than most quantizations
- Winogrande: decent performance (0.567)

Other tasks where qx4 performs well:

- Arc_Easy: 0.531 (solid)
- OpenBookQA: 0.426 (adequate for knowledge-based tasks)
- PIQA: 0.723 (good)

Limitations:

- Slightly weaker on OpenBookQA than qm68 (0.426 vs. 0.430)
- Below the group average on Winogrande (0.567)
- Slightly below the baseline on Arc_Easy

**Recommendation**

Use qx4 when boolean reasoning and commonsense understanding are critical, particularly for applications involving:

- Question answering that requires boolean logic (see the sketch after this list)
- Commonsense reasoning scenarios
- Complex multiple-choice question solving
- Tasks where HellaSwag performance matters

The model excels at combining logical reasoning (BoolQ) with contextual understanding (HellaSwag), making it a good fit for applications that blend precise logical inference with real-world commonsense knowledge. It is particularly strong in scenarios that require nuanced reasoning about everyday situations and causal relationships.

**Best for:** AI assistants and question-answering systems that need both logical precision and commonsense understanding.
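
As a concrete illustration of the boolean-reasoning use case, here is a minimal, hypothetical BoolQ-style prompt. The passage and question are invented for illustration, and the calls assume the `mlx_lm` API shown in the usage section below.

```python
from mlx_lm import load, generate

# Load the model named above (same API as in the usage section below).
model, tokenizer = load("unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx")

# Hypothetical BoolQ-style item: a short passage plus a yes/no question.
passage = (
    "BoolQ items pair a short passage with a yes/no question whose "
    "answer can be inferred from the passage."
)
question = "Does each BoolQ item include a yes/no question?"
prompt = f"Passage: {passage}\nQuestion: {question}\nAnswer with yes or no."

# Wrap the prompt in the chat template when one is available.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```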


This model [unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx](https://huggingface.co/unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx) was
converted to MLX format from [unsloth/Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct)
using mlx-lm version **0.26.3**.
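
For reference, a minimal sketch of how such a conversion is typically run with the stock mlx-lm CLI. Note this produces a standard 4-bit quantization; the "qx4" mixed-precision recipe used for this repo is not a stock mlx-lm flag, so treat the commands as an approximation of the workflow, not the exact one used here.

```bash
pip install "mlx-lm==0.26.3"
# Convert the Hugging Face checkpoint to MLX format and quantize to 4 bits.
mlx_lm.convert \
    --hf-path unsloth/Qwen3-Coder-30B-A3B-Instruct \
    --mlx-path Qwen3-Coder-30B-A3B-Instruct-4bit-mlx \
    -q --q-bits 4
```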

## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Load the quantized model and tokenizer from the Hub (or a local path).
model, tokenizer = load("unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx")

prompt = "hello"

# If the tokenizer ships a chat template, wrap the prompt in a chat turn
# so the model sees the instruction format it was tuned on.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

# Generate a completion; verbose=True streams tokens to stdout.
response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
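
For quick checks without writing Python, mlx-lm also ships a command-line entry point (model path assumed as above):

```bash
mlx_lm.generate --model unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx --prompt "hello"
```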