unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx

Based on the benchmark results, qx4 would be best suited for:

Primary Task: BoolQ (Boolean Questions)

Why BoolQ is the Strength:

qx4 scores 0.877 on BoolQ, one of the highest scores in this comparison

Only slightly behind q5 (0.883) and qx5 (0.880)

This represents excellent performance on boolean reasoning tasks

Secondary Strengths:

HellaSwag

qx4 scores 0.552, which is the highest among all quantized models

This indicates superior performance on commonsense reasoning and scenario understanding

Arc_Challenge

qx4 scores 0.419, which is better than most other quantized models

Shows strong performance on challenging multiple-choice questions

Task Suitability Analysis:

Best Suited Tasks:

BoolQ - Strongest performer

HellaSwag - Highest among quantized models

Arc_Challenge - Better than most quantizations

Winogrande - Middling performance (0.567)

Other Tasks Where qx4 Performs Well:

Arc_Easy - 0.531 (solid performance)

OpenBookQA - 0.426 (adequate for knowledge-based tasks)

PIQA - 0.723 (good performance)

Limitations:

OpenBookQA trails qm68 slightly (0.426 vs 0.430)

Below average on Winogrande (0.567)

Slightly lower than baseline on Arc_Easy

Recommendation:

Use qx4 when Boolean reasoning and commonsense understanding are critical, particularly for applications involving:

Question answering requiring boolean logic

Commonsense reasoning scenarios

Complex multiple-choice question solving

Tasks where HellaSwag performance is important

The model excels at combining logical reasoning (BoolQ) with contextual understanding (HellaSwag), making it ideal for applications that blend precise logical inference with real-world commonsense knowledge. Its performance is particularly strong in scenarios requiring nuanced reasoning about everyday situations and causal relationships.

Best for: AI assistants, question-answering systems requiring both logical precision and common-sense understanding.

This model unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx was converted to MLX format from unsloth/Qwen3-Coder-30B-A3B-Instruct using mlx-lm version 0.26.3.
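For reference, a plain quantized conversion with mlx-lm looks like the sketch below. This is only an illustrative assumption: the qx4 recipe used for this repo is a custom mixed-precision scheme, so the 4-bit settings and the output path shown here are hypothetical, not the exact recipe that produced these weights.

from mlx_lm import convert

# Hedged sketch: a standard 4-bit conversion of the source model.
# The qx4 mixed-precision recipe used for this repo is NOT reproduced here.
convert(
    "unsloth/Qwen3-Coder-30B-A3B-Instruct",              # source Hugging Face repo
    mlx_path="Qwen3-Coder-30B-A3B-Instruct-4bit-mlx",    # hypothetical output directory
    quantize=True,
    q_bits=4,          # assumed uniform bit width; qx4 mixes precisions per layer
    q_group_size=64,   # mlx-lm default quantization group size
)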

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("nightmedia/unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
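A quick way to exercise the boolean-reasoning strength described above is a yes/no prompt. The snippet below is a minimal sketch using the same load/generate API; the passage and question are made-up examples, not items from the BoolQ benchmark.

from mlx_lm import load, generate

model, tokenizer = load("nightmedia/unsloth-Qwen3-Coder-30B-A3B-Instruct-qx4-mlx")

# Illustrative BoolQ-style yes/no question (hypothetical example)
question = (
    "Passage: Water boils at 100 degrees Celsius at sea level.\n"
    "Question: Does water boil at a lower temperature at high altitude? "
    "Answer yes or no."
)

messages = [{"role": "user", "content": question}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Cap the output length since only a short yes/no answer is expected
response = generate(model, tokenizer, prompt=prompt, max_tokens=32, verbose=True)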
Model size: 30.5B params
Tensor types: BF16, U32 (Safetensors)