---
license: apache-2.0
base_model: unsloth/Qwen3-1.7B
tags:
- unsloth
- qwen3
- mathematical-reasoning
- sft
- anti-overfitting
language:
- en
pipeline_tag: text-generation
library_name: transformers
---
# Qwen3-1.7B Math SFT - Anti-Overfitting Version
Fine-tuned with anti-overfitting measures based on the paper "A Practical Two-Stage Recipe for Mathematical LLMs".
## Training Details
- Base Model: unsloth/Qwen3-1.7B
- Parameters: 1,720,032,256 (all fine-tuned)
- Epochs: 10
- Batch Size: 8
- Learning Rate: 5e-06 (reduced for stability)
- Weight Decay: 0.1 (increased regularization)
- Approach: Full-model fine-tuning with anti-overfitting measures (see the configuration sketch after this list)
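
For reference, here is a minimal sketch of how the hyperparameters above could map onto a Hugging Face `TrainingArguments` object. Only the numbers listed above come from this card; the output directory, precision flag, and evaluation/save strategies are illustrative assumptions.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the hyperparameters listed above; output_dir, bf16,
# and the evaluation/save strategies are assumptions for illustration.
training_args = TrainingArguments(
    output_dir="qwen3-1.7b-math-sft-antioverfitting",
    num_train_epochs=10,
    per_device_train_batch_size=8,
    learning_rate=5e-6,                 # reduced for stability
    weight_decay=0.1,                   # increased regularization
    warmup_ratio=0.1,                   # extended warmup: 10% of steps
    eval_strategy="steps",              # regular evaluation checkpoints
    save_strategy="steps",              # (older transformers versions use evaluation_strategy)
    load_best_model_at_end=True,        # required for early stopping on validation loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    bf16=True,
)
```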
## Anti-Overfitting Measures
- Reduced learning rate: 5e-06
- Increased weight decay: 0.1
- Extended warmup: 10% of steps
- Early stopping on validation loss (see the callback sketch after this list)
- Regular evaluation checkpoints
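
Early stopping on validation loss is available out of the box in `transformers` via `EarlyStoppingCallback`. The sketch below shows one plausible way to wire it up; the trainer class, datasets, and patience value are assumptions, not details documented for this run.

```python
from transformers import EarlyStoppingCallback, Trainer

# Hypothetical wiring: `model`, `train_dataset`, and `eval_dataset` are assumed
# to be defined elsewhere; a patience of 3 evaluations is an illustrative choice.
trainer = Trainer(
    model=model,
    args=training_args,  # e.g. the TrainingArguments sketched above
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```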
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Cbgcbg/qwen3-1.7b-math-sft-antioverfitting-20250724_165951",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Cbgcbg/qwen3-1.7b-math-sft-antioverfitting-20250724_165951")

messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is 2+2?"},
]

# Build the prompt with the chat template and move it to the model's device
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids=inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
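
Since the system prompt asks the model to put its final answer inside `\boxed{}`, a small illustrative helper (not part of this repository) can pull that answer out of the decoded `response` string above:

```python
import re

# Illustrative helper, assuming simple non-nested \boxed{...} spans.
def extract_boxed_answer(text: str) -> str | None:
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None

print(extract_boxed_answer(response))  # expected "4" for the 2+2 example
```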
Training timestamp: 20250724_165951