# LIMO-Qwen3-8B-Math
This model is a LoRA fine-tune of Qwen3-8B on the GAIR/LIMO dataset, targeted at mathematical reasoning tasks.
## Model Details
- Base Model: Qwen3-8B (4-bit quantized)
- Training Method: LoRA fine-tuning with Unsloth
- Dataset: GAIR/LIMO (817 high-quality samples)
- Training Framework: Unsloth + SFTTrainer
- Sequence Length: 4096 tokens
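As a point of reference, the sketch below shows one way the GAIR/LIMO samples could be rendered into chat-formatted training text with the system prompt used by this model. It is a reconstruction under stated assumptions, not the actual training script: in particular, the `question` and `solution` field names are guesses about the GAIR/LIMO schema.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Sketch only: the "question"/"solution" field names are assumptions about
# the GAIR/LIMO schema, and the real preprocessing may differ.
SYSTEM_PROMPT = "Please reason step by step, and put your final answer within \\boxed{}."

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
dataset = load_dataset("GAIR/LIMO", split="train")  # 817 curated problems

def to_chat_text(example):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["solution"]},
    ]
    # Render the conversation into a single training string ("text" column)
    # using the base model's chat template.
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_chat_text)
```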
## Training Configuration
- Batch Size: 8
- Gradient Accumulation: 1
- Learning Rate: 2e-5
- Epochs: 3
- LoRA Rank: 16
- LoRA Alpha: 32
- LoRA Dropout: 0.1
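The hyperparameters above map onto an Unsloth LoRA setup roughly like the following sketch. This is a reconstruction, not the actual training script: the `target_modules` list, output directory, and precision flag are assumptions, and the `SFTTrainer` keyword names vary slightly between TRL versions.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit.
model, tokenizer = FastLanguageModel.from_pretrained(
    "Qwen/Qwen3-8B",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters with the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,        # chat-formatted LIMO split from the sketch above
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=8,
        gradient_accumulation_steps=1,
        learning_rate=2e-5,
        num_train_epochs=3,
        bf16=True,                # assumed precision setting
        output_dir="outputs",     # assumed
    ),
)
trainer.train()
```

With 817 samples, an effective batch size of 8, and 3 epochs, this amounts to roughly 300 optimizer steps in total.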
## Performance
This model follows the LIMO ("Less Is More for Reasoning") approach, achieving strong mathematical reasoning performance from a small set of carefully curated, high-quality training examples rather than from large-scale data.
## Usage
```python
from unsloth import FastLanguageModel
import torch

# Load the fine-tuned model in 4-bit
model, tokenizer = FastLanguageModel.from_pretrained(
    "Cbgcbg/limo-qwen3-8b-math",
    max_seq_length=4096,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)

# Enable Unsloth's optimized inference mode
FastLanguageModel.for_inference(model)

# Format the input with the system prompt used during training
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the sum of the first 10 positive integers?"},
]
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)

# Generate a response
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
```
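For this example prompt, the correct answer is 55 (1 + 2 + ... + 10 = 10·11/2 = 55), so a successful generation should end with `\boxed{55}`. Because sampling is enabled with `temperature=0.7`, the reasoning text will vary between runs; set `do_sample=False` for deterministic, greedy decoding.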
## System Prompt
The model was trained with the following system prompt:
```
Please reason step by step, and put your final answer within \boxed{}.
```
## Citation
If you use this model, please cite the original LIMO paper:
```bibtex
@misc{ye2025limoreasoning,
      title={LIMO: Less is More for Reasoning},
      author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
      year={2025},
      eprint={2502.03387},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.03387},
}
```
## Training Details
This model was trained using the LIMO methodology, which demonstrates that high-quality mathematical reasoning can be achieved with minimal but carefully curated training data.
## Limitations
- Optimized specifically for mathematical reasoning tasks
- May not perform as well on general conversation tasks
- Requires the training system prompt (shown above) for best results