This model was fine-tuned with Unsloth using QLoRA.

  • Base Model: unsloth/Qwen3-4B-unsloth-bnb-4bit
  • Parameters: 4,088,528,384
  • Dataset: a 0.65/0.35 mix of "unsloth/OpenMathReasoning-mini" and "mlabonne/FineTome-100k", combining reasoning and non-reasoning data.
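A minimal sketch of the 0.65/0.35 mixing described above, using plain Python lists as stand-ins for the two datasets (the real pipeline would load them with `datasets.load_dataset`; the function name, seed, and sampling strategy here are assumptions for illustration):

```python
import random

def mix_datasets(reasoning, non_reasoning, reasoning_frac=0.65, seed=3407):
    """Build a training mix where ~65% of examples come from the reasoning
    set and ~35% from the non-reasoning set. The fractions match the ratios
    above; everything else is a hypothetical sketch."""
    # Largest total size that both pools can supply at the target ratio.
    total = min(int(len(reasoning) / reasoning_frac),
                int(len(non_reasoning) / (1 - reasoning_frac)))
    n_reason = int(total * reasoning_frac)
    n_other = total - n_reason
    rng = random.Random(seed)
    mix = rng.sample(reasoning, n_reason) + rng.sample(non_reasoning, n_other)
    rng.shuffle(mix)
    return mix

# Toy stand-ins for unsloth/OpenMathReasoning-mini and mlabonne/FineTome-100k
reasoning_rows = [{"src": "math", "id": i} for i in range(650)]
chat_rows = [{"src": "chat", "id": i} for i in range(350)]
mixed = mix_datasets(reasoning_rows, chat_rows)
```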

Comparison to Qwen3-4B.

  • Evaluation datasets: gpqa, arc, competition_math, gsm8k.
  • unsloth/Qwen3-4B-unsloth-bnb-4bit:
  • +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Model   | Dataset          | Metric          | Subset        |   Num |   Score | Cat.0   |
    +=========+==================+=================+===============+=======+=========+=========+
    | qwen    | arc              | AverageAccuracy | ARC-Easy      |    30 |  0.8    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | arc              | AverageAccuracy | ARC-Challenge |    30 |  0.7    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | arc              | AverageAccuracy | OVERALL       |    60 |  0.75   | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | competition_math | AveragePass@1   | Level 1       |    30 |  0.3    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | competition_math | AveragePass@1   | Level 2       |    30 |  0.2    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | competition_math | AveragePass@1   | Level 3       |    30 |  0.1333 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | competition_math | AveragePass@1   | Level 4       |    30 |  0.0667 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | competition_math | AveragePass@1   | Level 5       |    30 |  0.0667 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | competition_math | AveragePass@1   | OVERALL       |   150 |  0.1533 | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | gpqa             | AveragePass@1   | gpqa_extended |    30 |  0.2333 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | gpqa             | AveragePass@1   | gpqa_main     |    30 |  0.3    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | gpqa             | AveragePass@1   | gpqa_diamond  |    30 |  0.3    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | gpqa             | AveragePass@1   | OVERALL       |    90 |  0.2778 | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | qwen    | gsm8k            | AverageAccuracy | main          |    30 |  0.4667 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+ 
    
  • This model:
  • +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Model   | Dataset          | Metric          | Subset        |   Num |   Score | Cat.0   |
    +=========+==================+=================+===============+=======+=========+=========+
    |ThisModel| arc              | AverageAccuracy | ARC-Easy      |    30 |  0.9    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| arc              | AverageAccuracy | ARC-Challenge |    30 |  0.8    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| arc              | AverageAccuracy | OVERALL       |    60 |  0.85   | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| competition_math | AveragePass@1   | Level 1       |    30 |  0.9    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| competition_math | AveragePass@1   | Level 2       |    30 |  0.9    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| competition_math | AveragePass@1   | Level 3       |    30 |  0.8    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| competition_math | AveragePass@1   | Level 4       |    30 |  0.7333 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| competition_math | AveragePass@1   | Level 5       |    30 |  0.4667 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| competition_math | AveragePass@1   | OVERALL       |   150 |  0.76   | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| gpqa             | AveragePass@1   | gpqa_extended |    30 |  0.3333 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| gpqa             | AveragePass@1   | gpqa_main     |    30 |  0.3    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| gpqa             | AveragePass@1   | gpqa_diamond  |    30 |  0.3333 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| gpqa             | AveragePass@1   | OVERALL       |    90 |  0.3222 | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    |ThisModel| gsm8k            | AverageAccuracy | main          |    30 |  0.8    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+ 
    
  • Qwen/Qwen3-4B
  • +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Model   | Dataset          | Metric          | Subset        |   Num |   Score | Cat.0   |
    +=========+==================+=================+===============+=======+=========+=========+
    | Qwen3   | arc              | AverageAccuracy | ARC-Easy      |    30 |  0.9    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | arc              | AverageAccuracy | ARC-Challenge |    30 |  0.8    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | arc              | AverageAccuracy | OVERALL       |    60 |  0.85   | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | competition_math | AveragePass@1   | Level 1       |    30 |  0.3    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | competition_math | AveragePass@1   | Level 2       |    30 |  0.2667 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | competition_math | AveragePass@1   | Level 3       |    30 |  0.1333 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | competition_math | AveragePass@1   | Level 4       |    30 |  0.2    | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | competition_math | AveragePass@1   | Level 5       |    30 |  0      | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | competition_math | AveragePass@1   | OVERALL       |   150 |  0.18   | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+ 
    | Qwen3   | gpqa             | AveragePass@1   | gpqa_extended |    50 |    0.32 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | gpqa             | AveragePass@1   | gpqa_main     |    50 |    0.22 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | gpqa             | AveragePass@1   | gpqa_diamond  |    50 |    0.18 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | gpqa             | AveragePass@1   | OVERALL       |   150 |    0.24 | -       |
    +---------+------------------+-----------------+---------------+-------+---------+---------+
    | Qwen3   | gsm8k            | AverageAccuracy | main          |    50 |    0.48 | default |
    +---------+------------------+-----------------+---------------+-------+---------+---------+ 
    
  • As the tables show, this model performs markedly better on math and reasoning tasks:
  • arc: 0.75 --> 0.85
  • competition_math: 0.1533 --> 0.76
  • gpqa: 0.2778 --> 0.3222
  • gsm8k: 0.4667 --> 0.8
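The OVERALL rows in the tables above are sample-weighted averages of the per-subset scores, which is easy to sanity-check:

```python
def overall(subsets):
    """Sample-weighted average of (num_samples, score) pairs,
    rounded to 4 decimals like the tables above."""
    total = sum(n for n, _ in subsets)
    return round(sum(n * s for n, s in subsets) / total, 4)

# Base-model competition_math subset scores (Levels 1-5) from the first table
base_math = [(30, 0.3), (30, 0.2), (30, 0.1333), (30, 0.0667), (30, 0.0667)]

# Base-model ARC subset scores from the first table
base_arc = [(30, 0.8), (30, 0.7)]
```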

Use This Model:

  • from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
    
    model_name = "wesjos/Qwen3-4B-math"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto"
    )
    
    prompt = "Let f(x) be a differentiable function defined on the reals satisfying f(0) = 1 and, for all real x, f'(x) = 2f(x) + 3. Find an explicit expression for f(x)."
    messages = [
        {"role": "user", "content": prompt}
    ]
    
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=True # Switches between thinking and non-thinking modes. Default is True.
    )
    
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    
    text_streamer = TextStreamer(tokenizer)
    _ = model.generate(**model_inputs, streamer=text_streamer, max_new_tokens=2048)
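With enable_thinking=True, Qwen3-style models emit their chain of thought before the final answer, wrapped in a <think>...</think> block. A minimal sketch for separating the two from a decoded output string (the tag format follows Qwen3's conventions; the function name and the sample completion below are illustrative assumptions):

```python
def split_thinking(decoded: str):
    """Split a Qwen3-style completion into (thinking, answer).
    If no closing </think> tag is present, everything is the answer."""
    tag = "</think>"
    if tag in decoded:
        thinking, _, answer = decoded.partition(tag)
        return thinking.replace("<think>", "").strip(), answer.strip()
    return "", decoded.strip()

# Hypothetical completion for the ODE prompt above
thinking, answer = split_thinking(
    "<think>Solve f'(x) = 2f(x) + 3 with f(0) = 1.</think>\nf(x) = (5/2)e^{2x} - 3/2"
)
```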
