Model Card: TinyLlama Math Reasoning Assistant

Model Description

This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 specialized for solving mathematical problems with step-by-step reasoning. It uses a structured format with XML tags to separate reasoning steps from the final answer.

Model Details

  • Base Model: TinyLlama-1.1B-Chat-v1.0
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Approach: Supervised Fine-Tuning followed by GRPO (Group Relative Policy Optimization)
  • Developed By: pierizvi
  • Model Type: Causal Language Model (decoder-only)
  • License: MIT
  • Repository: Pierizvi/infused-reasoning-tinyllama-math

Intended Use

This model is designed to:

  • Solve basic arithmetic problems
  • Handle word problems requiring multi-step reasoning
  • Show step-by-step calculation processes
  • Verify calculation correctness
  • Provide structured responses with reasoning and answers

The model is intended for educational purposes, homework assistance, and demonstrating mathematical reasoning processes.

Response Format

The model generates responses in a consistent format:

<reasoning>
Step-by-step solution process with calculations
</reasoning>
<answer>The final answer</answer>

This structured format makes it easy to distinguish between the reasoning process and the final answer.
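
Because the tags are fixed, downstream code can pull the two parts out with a small parser. The helper below is a minimal sketch (the function name and the empty-string fallback are illustrative, not part of the model):

import re

def parse_response(text):
    """Split a model response into its <reasoning> and <answer> parts.
    Returns empty strings for missing tags so malformed generations can be detected."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return {
        "reasoning": reasoning.group(1).strip() if reasoning else "",
        "answer": answer.group(1).strip() if answer else "",
    }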

Capabilities

The model can handle:

  • Basic arithmetic operations (addition, subtraction, multiplication, division)
  • Word problems involving rates, proportions, and percentages
  • Multi-step reasoning problems
  • Problems requiring formula application (e.g., distance = speed × time)

Training Methodology

The model was trained in multiple phases:

  1. Format Learning: Initial training focused on teaching the model to use the XML tag format
  2. Calculation Training: Subsequent training focused on accurate mathematical calculations
  3. Reasoning Enhancement: Final training emphasized natural reasoning approaches before calculations

Training used custom reward functions that evaluated the following (an illustrative sketch of one such reward appears after this list):

  • Format adherence
  • Reasoning quality
  • Calculation accuracy
  • Mathematical consistency
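
The exact reward functions are not reproduced here; the sketch below only illustrates the general shape of a format-adherence reward of the kind listed above (the function name, pattern, and partial-credit scores are hypothetical):

import re

def format_reward(completion):
    """Hypothetical format-adherence reward: 1.0 for a completion that is exactly one
    <reasoning> block followed by one <answer> block, partial credit otherwise."""
    pattern = r"^\s*<reasoning>.*?</reasoning>\s*<answer>.*?</answer>\s*$"
    if re.match(pattern, completion, re.DOTALL):
        return 1.0
    # Partial credit when both tag pairs are present but extra text surrounds them
    has_reasoning = "<reasoning>" in completion and "</reasoning>" in completion
    has_answer = "<answer>" in completion and "</answer>" in completion
    return 0.5 if has_reasoning and has_answer else 0.0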

Limitations

  • Small Base Model: As a 1.1B parameter model, it has limited knowledge compared to larger models
  • Calculation Complexity: May struggle with complex or multi-step calculations
  • Domain Specificity: Primarily focused on elementary and middle-school level mathematics
  • Memory Requirements: Requires careful memory optimization when running in CPU-only environments

Usage Examples

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load tokenizer and adapter
tokenizer = AutoTokenizer.from_pretrained("Pierizvi/infused-reasoning-tinyllama-math")
model = PeftModel.from_pretrained(base_model, "Pierizvi/infused-reasoning-tinyllama-math")

# Define function to solve problems
def solve_problem(question):
    system_prompt = """You MUST respond using ONLY this exact format:

<reasoning>
Think through the problem step by step.
Show your calculations clearly.
</reasoning>
<answer>Your final answer here</answer>"""
    
    prompt = f"{system_prompt}\n\nQuestion: {question}\n\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,                          # pass input_ids and attention_mask
            max_new_tokens=256,
            do_sample=True,                    # sampling must be enabled for temperature to take effect
            temperature=0.5,
            pad_token_id=tokenizer.eos_token_id
        )
    
    response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return response

# Example usage
print(solve_problem("If a train travels at 60 miles per hour for 2.5 hours, how far will it travel?"))

Technical Specifications

  • LoRA Configuration:

    • Rank: 16
    • Alpha: 32
    • Target Modules: q_proj, k_proj, v_proj, o_proj
    • Dropout: 0.05
  • Memory Optimization:

    • Works with 4-bit and 8-bit quantization for CPU deployment
    • Can be run on Hugging Face Spaces instances with 16 GB of RAM with appropriate memory optimization (see the loading sketch below)
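
The adapter settings above correspond to a standard PEFT LoraConfig, and the quantized loading path can be expressed with a transformers BitsAndBytesConfig. The sketch below is illustrative: task_type, bias, and the 4-bit compute dtype are assumptions not stated in this card.

import torch
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# LoRA adapter configuration matching the values listed above
# (bias and task_type are assumed defaults, not taken from the card)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Optional 4-bit load of the base model for memory-constrained environments
# (requires the bitsandbytes package; the compute dtype is an assumption)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)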

Ethical Considerations

  • The model should not be used for high-stakes mathematical calculations where errors could have serious consequences
  • The reasoning steps should be verified by users and not taken as definitive mathematical proofs
  • The model is intended as a learning aid and not a replacement for proper mathematical education

Acknowledgments

  • Thanks to the TinyLlama team for the base model
  • Inspired by DeepSeek's approaches to reasoning model training
  • Developed using the Hugging Face ecosystem and PEFT library

Citation

If you use this model in research, please cite:

@misc{infused-reasoning-tinyllama-math,
  author = {pierizvi},
  title = {TinyLlama Math Reasoning Assistant},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/Pierizvi/infused-reasoning-tinyllama-math}}
}