Model Card for Qwen3-0.6B-OpenMathReason

Model Description

This model is a fine-tuned version of Qwen/Qwen3-0.6B, trained with the Unsloth library using LoRA for parameter-efficient fine-tuning. It was trained on two datasets:

  • unsloth/OpenMathReason-mini — for enhancing mathematical reasoning skills.
  • mlabonne/FineTome-100k — to improve general conversational abilities.
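
To give a rough sense of why LoRA is parameter-efficient, the sketch below counts the trainable parameters LoRA adds for a single weight matrix. The rank `r = 16` and the 1024 x 1024 matrix shape are illustrative assumptions; this card does not state the LoRA rank or the target modules used.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds for one weight matrix of shape (d_out, d_in).

    LoRA learns two low-rank factors, A (r x d_in) and B (d_out x r),
    while the original matrix stays frozen.
    """
    return r * d_in + d_out * r


# Hypothetical values, not taken from this model's training setup.
d_in, d_out, r = 1024, 1024, 16

full = d_in * d_out                      # parameters in the frozen matrix
lora = lora_param_count(d_in, d_out, r)  # parameters LoRA trains instead

print(f"full matrix: {full:,} params")
print(f"LoRA (r={r}): {lora:,} params ({lora / full:.2%} of full)")
```

Under these assumed values, the adapter trains only a small fraction of the parameters in the matrix it modifies, which is what keeps the fine-tune lightweight.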

Model Details

  • Developed by: Rustam Shiriyev
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: unsloth/Qwen3-0.6B

Uses

Direct Use

This model can be used as a lightweight assistant capable of solving basic to intermediate math problems (OpenMathReason tasks).

Downstream Use

  • Can be integrated into educational chatbots for STEM learning.

Out-of-Scope Use

  • Not suitable for high-stakes decision-making.

Bias, Risks, and Limitations

  • Mathematical reasoning is limited to the scope of the OpenMathReason-mini dataset.
  • Conversational quality may degrade with complex or multi-turn inputs.

How to Get Started with the Model

from huggingface_hub import login
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Authenticate with the Hugging Face Hub. Replace the empty string with your
# access token before running.
login(token="")

# Load the tokenizer and the base model.
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-0.6B")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-0.6B",
    device_map="auto",
)

# Attach the LoRA adapter weights to the base model.
model = PeftModel.from_pretrained(base_model, "Rustamshry/Qwen3-0.6B-OpenMathReason")

question = "Solve (x + 2)^2 = 0"

messages = [
    {"role" : "user", "content" : question}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True, 
    enable_thinking = True,
)

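# Sample a response and stream tokens to stdout as they are generated.
inputs = tokenizer(text, return_tensors = "pt").to(model.device)
_ = model.generate(
    **inputs,
    max_new_tokens = 2048,
    do_sample = True,  # make sampling explicit so temperature, top_p, and top_k take effect
    temperature = 0.6, top_p = 0.95, top_k = 20,
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)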
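
With `enable_thinking = True`, Qwen3 chat models typically emit their reasoning between `<think>` and `</think>` tags before the final answer. If the output is collected as a decoded string rather than streamed, a small helper along these lines can separate the two. The function name `split_thinking` and the sample string are hypothetical and used only for illustration.

```python
def split_thinking(decoded: str) -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    Assumes the reasoning, if present, ends with a closing </think> tag.
    If the tag is missing, the whole text is treated as the answer.
    """
    close_tag = "</think>"
    head, sep, tail = decoded.partition(close_tag)
    if not sep:
        return "", decoded.strip()
    # Drop the opening tag (if any) and surrounding whitespace.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Hypothetical output string, not real model output.
sample = "<think>\nTake the square root of both sides.\n</think>\n\nx = -2"
reasoning, answer = split_thinking(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Showing only the answer part may suit an educational chatbot better, while the reasoning part can be logged or revealed on request.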

Training Details

Training Data

  • unsloth/OpenMathReason-mini: 10k+ instruction-following examples focused on math.
  • mlabonne/FineTome-100k: 100k examples of diverse, high-quality chat data.

Training Procedure

  • batch size: 8
  • gradient accumulation steps: 2
  • optimizer: adamw_torch
  • learning rate: 2e-5
  • warmup steps: 100
  • fp16: True
  • dataloader_num_workers: 16
  • num_train_epochs: 1
  • weight_decay: 0.01
  • lr_scheduler_type: linear
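
Taken together, these settings imply an effective batch size of 8 x 2 = 16 sequences per optimizer step, assuming a single training device. The sketch below works that out and approximates the shape of a linear schedule with warmup, as selected by the linear scheduler type listed above. `TOTAL_STEPS = 1000` is a hypothetical value, since the card does not report the number of training steps.

```python
BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
LEARNING_RATE = 2e-5
WARMUP_STEPS = 100
TOTAL_STEPS = 1000  # hypothetical; the actual step count is not reported

# Sequences contributing to each optimizer update (single device assumed).
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_lr(step: int,
              peak: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear ramp to `peak`, then linear decay to 0."""
    if step < warmup:
        return peak * step / max(1, warmup)
    remaining = max(0, total - step)
    return peak * remaining / max(1, total - warmup)


print("effective batch size:", effective_batch_size)
for step in (0, 50, 100, 550, 1000):
    print(f"step {step:4d}: lr = {linear_lr(step):.2e}")
```

The learning rate rises from zero to its peak of 2e-5 over the first 100 steps and then decays linearly to zero by the final step.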

Results

  • Loss value: 0.56

Framework versions

  • PEFT 0.14.0