krishanwalia30/Qwen3-16bit-OpenMathReasoning-Finetuned-Merged

🚀 Harness the Power of Qwen-3 with Enhanced Reasoning and Chat! 🚀

This model is a carefully fine-tuned version of the incredible Qwen-3-8B, built with Unsloth and Parameter-Efficient Fine-Tuning (PEFT) via LoRA. It's designed to bring you the best of both worlds: the strong general capabilities of Qwen-3 with a significant boost in logical reasoning and engaging conversational skills.

We've taken the already powerful Qwen-3 and further sculpted it using a blend of the unsloth/OpenMathReasoning-mini (Chain-of-Thought split) for advanced problem-solving and the mlabonne/FineTome-100k dataset to ensure natural and fluent interactions.
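
For illustration only, here is a minimal sketch of how the two sources could be loaded and mapped into a common chat format before mixing. The split name ("cot") and column names ("problem", "generated_solution", "conversations") are assumptions based on the public dataset cards, not details taken from this model's actual training script.

from datasets import load_dataset

# Load the two public source datasets (split name is an assumption).
math_ds = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")
chat_ds = load_dataset("mlabonne/FineTome-100k", split="train")

def math_to_messages(example):
    # Assumed columns: "problem" (question) and "generated_solution" (chain-of-thought answer).
    return {"messages": [
        {"role": "user", "content": example["problem"]},
        {"role": "assistant", "content": example["generated_solution"]},
    ]}

math_chat = math_ds.map(math_to_messages, remove_columns=math_ds.column_names)
# FineTome-100k already ships ShareGPT-style "conversations"; it would be mapped into the
# same {"messages": [...]} schema, then the two sources combined at roughly a 30/70
# reasoning-to-chat ratio (see the fine-tuning details below).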

🔥 Key Features:

  • Enhanced Reasoning: Excels at tasks requiring logical deduction and step-by-step thinking, thanks to fine-tuning on a dedicated reasoning dataset.
  • Improved Chat: Maintains and enhances the general conversational abilities of Qwen-3, making it great for interactive applications.
  • Efficient Fine-Tuning: Built using the incredibly efficient Unsloth library, resulting in faster training with less memory usage.
  • PEFT (LoRA) Inside: Leverages Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning, making it easier to adapt to specific tasks without full model retraining.
  • Ready to Use: Seamlessly integrates with the transformers library.

๐Ÿ› ๏ธ How to Get Started:

Install the necessary libraries:

pip install transformers accelerate torch

Load and use the model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "krishanwalia30/Qwen3-16bit-OpenMathReasoning-Finetuned-Merged"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the merged 16-bit weights; device_map="auto" places them on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.float16)

messages = [
    {"role": "user", "content": "Explain the Pythagorean theorem in simple terms."},
    {"role": "assistant", "content": "Okay, here's a simple explanation:"},
    {"role": "user", "content": "Now, solve for the hypotenuse if a=3 and b=4."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, top_p=0.8, top_k=20, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
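
If GPU memory is tight, the merged 16-bit checkpoint can optionally be loaded in 4-bit instead (not used above; this is a sketch assuming the bitsandbytes package and a CUDA GPU are available):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Optional: quantize the 16-bit weights to 4-bit at load time to reduce memory usage.
model = AutoModelForCausalLM.from_pretrained(
    "krishanwalia30/Qwen3-16bit-OpenMathReasoning-Finetuned-Merged",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)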

โš™๏ธ Fine-tuning Details:

  • Base Model: Qwen-3-8B
  • Fine-tuning Framework: Unsloth
  • PEFT Strategy: LoRA
  • Training Datasets:
    • unsloth/OpenMathReasoning-mini (Chain-of-Thought split)
    • mlabonne/FineTome-100k
  • Training Ratio: Approximately 30% reasoning data and 70% general chat data to balance capabilities.
  • Training Infrastructure: Google Colab with a T4 GPU.
  • Quantization during Training: The base model was loaded in 4-bit (unsloth/Qwen3-8B-unsloth-bnb-4bit) during fine-tuning for memory efficiency. The final merged model is saved in 16-bit for broader compatibility.
  • Key Hyperparameters (a training-setup sketch follows this list):
    • per_device_train_batch_size: 2
    • gradient_accumulation_steps: 4
    • learning_rate: 2e-4
    • max_steps: 30
    • Optimizer: adamw_8bit
    • Learning Rate Scheduler: linear
    • Warmup Steps: 5
    • Weight Decay: 0.01
    • Seed: 3407
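
For reference, here is a minimal sketch of what this setup could look like with Unsloth and TRL under the hyperparameters above. The LoRA rank/alpha, target modules, sequence length, and the placeholder dataset are assumptions (they are not listed in this card), and exact SFTTrainer argument names vary across TRL versions.

from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the 4-bit base model used for fine-tuning (see "Finetuned from model" below).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B-unsloth-bnb-4bit",
    max_seq_length=2048,  # assumption; not stated in this card
    load_in_4bit=True,
)

# Attach LoRA adapters (rank, alpha, and target modules are assumptions).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Placeholder: in practice this is the 30/70 reasoning + chat mix described above,
# rendered to plain "text" with the model's chat template.
train_ds = Dataset.from_dict({"text": ["<formatted training example>"]})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=30,
        optim="adamw_8bit",
        lr_scheduler_type="linear",
        warmup_steps=5,
        weight_decay=0.01,
        seed=3407,
        fp16=True,  # the T4 GPU used for training does not support bf16
        output_dir="outputs",
    ),
)
trainer.train()

# Merge the LoRA weights into the base model and export in 16-bit (Unsloth helper).
model.save_pretrained_merged(
    "Qwen3-16bit-OpenMathReasoning-Finetuned-Merged", tokenizer, save_method="merged_16bit"
)

The merged 16-bit export is what is published in this repository, which is why the model loads directly with transformers and does not require Unsloth or PEFT at inference time.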

📊 Evaluation:

While rigorous quantitative evaluations are ongoing, initial assessments indicate a significant improvement in the model's ability to handle reasoning-based questions while maintaining strong general conversational skills. Further benchmarks and community feedback are welcome!

👨‍💻 Author:

krishanwalia30 (https://huggingface.co/krishanwalia30)

🔗 Learn More:

For a deeper dive into the fine-tuning process and the rationale behind the choices, check out the article: https://medium.com/@krishanw30/b1a8f684c3f1.

๐Ÿ™ Acknowledgements:

A big thank you to the brilliant teams at Qwen, Unsloth AI, and the creators of the OpenMathReasoning-mini and FineTome-100k datasets for making this project possible!

Uploaded model

  • Developed by: krishanwalia30
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen3-8B-unsloth-bnb-4bit

This Qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
