# Model Card for Qwen3-0.6B-OpenMathReason

## Model Description
This model is a fine-tuned version of Qwen/Qwen3-0.6B, trained with the Unsloth library and LoRA for parameter-efficient fine-tuning (a minimal setup sketch follows the dataset list below). It was trained on two datasets:
- unsloth/OpenMathReason-mini — for enhancing mathematical reasoning skills.
- mlabonne/FineTome-100k — to improve general conversational abilities.
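The card does not state the exact LoRA configuration, so the following is only a minimal sketch of a typical Unsloth + LoRA setup; the rank, alpha, dropout, target modules, sequence length, and 4-bit loading are illustrative assumptions, not the author's exact values.

```python
from unsloth import FastLanguageModel

# Load the base model and tokenizer through Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-0.6B",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,    # assumption
)

# Attach LoRA adapters for parameter-efficient fine-tuning.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,            # LoRA rank (assumption)
    lora_alpha=16,   # assumption
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],  # a typical choice for Qwen-style models (assumption)
)
```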
## Model Details
- Developed by: Rustam Shiriyev
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: unsloth/Qwen3-0.6B
## Uses

### Direct Use
This model can be used as a lightweight assistant for solving basic to intermediate math problems, in the style of the OpenMathReason-mini tasks.
### Downstream Use
- Can be integrated into educational chatbots for STEM learning.
### Out-of-Scope Use
- Not suitable for high-stakes decision-making.
## Bias, Risks, and Limitations
- Mathematical reasoning is limited to the scope of the OpenMathReason-mini dataset.
- Conversational quality may degrade with complex or multi-turn inputs.
## How to Get Started with the Model
```python
from huggingface_hub import login
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Only needed if the repository requires authentication; insert your HF token.
login(token="")

tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-0.6B")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-0.6B",
    device_map="auto",
)

# Load the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base_model, "Rustamshry/Qwen3-0.6B-OpenMathReason")

question = "Solve (x + 2)^2 = 0"
messages = [{"role": "user", "content": question}]

# Build the prompt with Qwen3's chat template; enable_thinking=True lets the
# model emit an explicit reasoning block before the final answer.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

_ = model.generate(
    **tokenizer(text, return_tensors="pt").to(model.device),
    max_new_tokens=2048,
    do_sample=True,  # required for temperature/top_p/top_k to take effect
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
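The sampling settings above (temperature 0.6, top_p 0.95, top_k 20) match the values Qwen recommends for Qwen3 in thinking mode, and `enable_thinking=True` makes the model stream a reasoning trace before its final answer.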
## Training Details

### Training Data
- unsloth/OpenMathReason-mini: 10k+ instruction-following examples focused on mathematical reasoning.
- mlabonne/FineTome-100k: 100k examples of diverse, high-quality chat data (a hypothetical loading sketch follows this list).
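Since the exact preprocessing script is not included in the card, the snippet below is only a hypothetical sketch of loading and mixing the two datasets; the split names, the "messages" column, and the shuffle seed are assumptions, and `tokenizer` refers to the one loaded in the earlier setup sketch.

```python
from datasets import load_dataset, concatenate_datasets

# Dataset IDs are taken from the card; split names are assumptions.
math_ds = load_dataset("unsloth/OpenMathReason-mini", split="train")
chat_ds = load_dataset("mlabonne/FineTome-100k", split="train")

def to_text(example):
    # Hypothetical normalization: render each conversation into a single
    # "text" field with the Qwen3 chat template. The "messages" column name
    # is an assumption; the real column names differ between the datasets.
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

# Give both sets an identical schema so they can be concatenated.
math_ds = math_ds.map(to_text, remove_columns=math_ds.column_names)
chat_ds = chat_ds.map(to_text, remove_columns=chat_ds.column_names)
combined = concatenate_datasets([math_ds, chat_ds]).shuffle(seed=42)
```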
### Training Procedure
- batch_size = 8
- gradient_accumulation_steps = 2
- optimizer = adamw_torch
- learning_rate = 2e-5
- warmup_steps = 100
- fp16 = True
- dataloader_num_workers = 16
- num_train_epochs = 1
- weight_decay = 0.01
- lr_scheduler_type = linear

These settings map directly onto standard Hugging Face `TrainingArguments`, as reconstructed in the sketch below.
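A minimal sketch of the training loop, assuming TRL's `SFTTrainer` as commonly used with Unsloth; `model`, `tokenizer`, and `combined` refer to the earlier sketches, and `max_seq_length` and `output_dir` are assumptions.

```python
from transformers import TrainingArguments
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,               # LoRA-wrapped model from the setup sketch
    tokenizer=tokenizer,
    train_dataset=combined,    # mixed dataset from the loading sketch
    dataset_text_field="text",
    max_seq_length=2048,       # assumption; moved into SFTConfig in newer TRL
    args=TrainingArguments(
        per_device_train_batch_size=8,
        gradient_accumulation_steps=2,
        optim="adamw_torch",
        learning_rate=2e-5,
        warmup_steps=100,
        fp16=True,
        dataloader_num_workers=16,
        num_train_epochs=1,
        weight_decay=0.01,
        lr_scheduler_type="linear",
        output_dir="outputs",  # assumption
    ),
)
trainer.train()
```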
### Results

- Final training loss: 0.56
### Framework versions
- PEFT 0.14.0