SmolLM3-3B-Math-Formulas-4bit

Model Description

SmolLM3-3B-Math-Formulas-4bit is a fine-tuned version of HuggingFaceTB/SmolLM3-3B specialized for mathematical formula understanding and generation. The model has been optimized using 4-bit quantization (NF4) with LoRA adapters for efficient training and inference.

  • Base Model: HuggingFaceTB/SmolLM3-3B
  • Model Type: Causal Language Model
  • Quantization: 4-bit NF4 with double quantization
  • Fine-tuning Method: QLoRA (Quantized Low-Rank Adaptation)
  • Specialization: Mathematical formulas and expressions
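
A minimal sketch of the quantization setup described above, using the bitsandbytes integration in Transformers (the values mirror the bullets; the compute dtype is an assumption based on the FP16 training precision):

from transformers import BitsAndBytesConfig
import torch

# 4-bit NF4 with double quantization, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,  # assumption: FP16 compute, matching the training precision
)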

Training Details

Dataset

  • Source: ddrg/math_formulas
  • Size: 1,000 samples (randomly selected from 2.89M total)
  • Content: Mathematical formulas, equations, and expressions in LaTeX format
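
A subset like this can be drawn with the datasets library; the seed below is a placeholder, not the one used for training:

from datasets import load_dataset

# Load ddrg/math_formulas and draw a random 1,000-sample subset
dataset = load_dataset("ddrg/math_formulas", split="train")
subset = dataset.shuffle(seed=42).select(range(1000))  # seed is illustrative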

Training Configuration

  • Training Loss: 0.589 (final)
  • Epochs: 6
  • Batch Size: 8 (per device)
  • Learning Rate: 2.5e-4 with cosine scheduler
  • Max Sequence Length: 128 tokens
  • Gradient Accumulation: 2 steps
  • Optimizer: AdamW with 0.01 weight decay
  • Precision: FP16
  • LoRA Configuration (see the PEFT sketch after this list):
    • r=4, alpha=8
    • Dropout: 0.1
    • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
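
In PEFT terms, the LoRA settings above correspond to roughly the following configuration (a sketch; the actual training script may differ):

from peft import LoraConfig

lora_config = LoraConfig(
    r=4,
    lora_alpha=8,
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)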

Hardware & Performance

  • Training Time: 265 seconds (4.4 minutes)
  • Training Speed: 5.68 samples/second
  • Total Steps: 96
  • Memory Efficiency: 4-bit quantization for reduced VRAM usage
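
For scale: NF4 stores weights at roughly 0.5 bytes per parameter, so the ~3B-parameter base needs about 1.5 GB for weights in 4-bit versus roughly 6 GB in FP16; actual usage also depends on activations, the LoRA adapters, and optimizer state.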

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "sweatSmile/HF-SmolLM3-3B-Math-Formulas-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Generate mathematical content
prompt = r"Explain this mathematical formula: $e^{i\pi} + 1 = 0$"  # example LaTeX prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # keep inputs on the model's device

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
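
If the repository hosts only the LoRA adapter weights rather than a merged checkpoint (the model is published as an adapter of SmolLM3-3B), load the base model first and attach the adapter with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM
import torch

# Load the base model, then attach the fine-tuned adapter
base = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM3-3B",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "sweatSmile/HF-SmolLM3-3B-Math-Formulas-4bit")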

Intended Use Cases

  • Mathematical Education: Explaining mathematical formulas and concepts
  • LaTeX Generation: Creating properly formatted mathematical expressions
  • Formula Analysis: Understanding and breaking down complex mathematical equations
  • Mathematical Problem Solving: Assisting with mathematical computations and derivations

Limitations

  • Domain Specific: Optimized primarily for mathematical content
  • Training Data Size: Fine-tuned on only 1,000 samples
  • Quantization Effects: 4-bit quantization may introduce minor precision loss
  • Context Length: Fine-tuning used a 128-token maximum sequence length, so longer formulas and derivations were not seen during training
  • Language: Primarily trained on English mathematical notation

Performance Metrics

  • Final Training Loss: 0.589
  • Convergence: Final loss reached within 6 epochs
  • Improvement: 52% lower final loss than the baseline configuration
  • Efficiency: 51% shorter training time than the initial setup

Model Architecture

Based on SmolLM3-3B with the following modifications:

  • 4-bit NF4 quantization for memory efficiency
  • LoRA adapters for parameter-efficient fine-tuning
  • Specialized for mathematical formula understanding
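
For deployment without a runtime PEFT dependency, the LoRA adapter can be merged back into the base weights (a standard PEFT operation, shown as a sketch; the output path is illustrative):

from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM3-3B")
merged = PeftModel.from_pretrained(
    base, "sweatSmile/HF-SmolLM3-3B-Math-Formulas-4bit"
).merge_and_unload()  # folds the LoRA deltas into the base weights
merged.save_pretrained("smollm3-math-merged")  # hypothetical output directory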

Citation

If you use this model, please cite:

@misc{smollm3-math-formulas-4bit,
  title        = {SmolLM3-3B-Math-Formulas-4bit},
  author       = {sweatSmile},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/sweatSmile/HF-SmolLM3-3B-Math-Formulas-4bit}},
  note         = {QLoRA fine-tuning of HuggingFaceTB/SmolLM3-3B on the ddrg/math_formulas dataset}
}

License

This model inherits the license from the base SmolLM3-3B model. Please refer to the original model's license for usage terms.

Acknowledgments

  • Base Model: HuggingFace Team for SmolLM3-3B
  • Dataset: Dresden Database Research Group for the math_formulas dataset
  • Training Framework: Hugging Face Transformers and TRL libraries
  • Quantization: bitsandbytes library for 4-bit optimization