Model Card: TinyLlama Math Reasoning Assistant
Model Description
This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 specialized for solving mathematical problems with step-by-step reasoning. It uses a structured format with XML tags to separate reasoning steps from the final answer.
Model Details
- Base Model: TinyLlama-1.1B-Chat-v1.0
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Approach: Supervised Fine-Tuning followed by GRPO (Group Relative Policy Optimization)
- Developed By: pierizvi
- Model Type: Causal Language Model (decoder-only)
- License: MIT
- Repository: Pierizvi/infused-reasoning-tinyllama-math
Intended Use
This model is designed to:
- Solve basic arithmetic problems
- Handle word problems requiring multi-step reasoning
- Show step-by-step calculation processes
- Verify calculation correctness
- Provide structured responses with reasoning and answers
The model is intended for educational purposes, homework assistance, and demonstrating mathematical reasoning processes.
Response Format
The model generates responses in a consistent format:
<reasoning>
Step-by-step solution process with calculations
</reasoning>
<answer>The final answer</answer>
This structured format makes it easy to distinguish between the reasoning process and the final answer.
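Because the tags are fixed, the two parts can be separated with a simple parser. Below is a minimal sketch (the parse_response helper and the sample string are illustrative, not part of the released code), using Python's standard re module:

import re

def parse_response(text):
    # Pull the contents of the <reasoning> and <answer> tags, if present.
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        reasoning.group(1).strip() if reasoning else None,
        answer.group(1).strip() if answer else None,
    )

# Illustrative input; real model output may vary.
sample = "<reasoning>60 * 2.5 = 150</reasoning>\n<answer>150 miles</answer>"
steps, final = parse_response(sample)
print(steps)  # 60 * 2.5 = 150
print(final)  # 150 miles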
Capabilities
The model can handle:
- Basic arithmetic operations (addition, subtraction, multiplication, division)
- Word problems involving rates, proportions, and percentages
- Multi-step reasoning problems
- Problems requiring formula application (e.g., distance = speed × time)
Training Methodology
The model was trained in multiple phases:
- Format Learning: Initial training focused on teaching the model to use the XML tag format
- Calculation Training: Subsequent training focused on accurate mathematical calculations
- Reasoning Enhancement: Final training emphasized natural reasoning approaches before calculations
Training utilized custom reward functions that evaluated the following (a minimal sketch of one such reward appears after this list):
- Format adherence
- Reasoning quality
- Calculation accuracy
- Mathematical consistency
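As an illustration of how such rewards can be implemented, here is a minimal sketch of a format-adherence reward. The exact reward functions used in training are not published in this card, so the helper name (format_reward) and the partial-credit weights are hypothetical:

import re

# Hypothetical format-adherence reward: 1.0 when the completion contains a
# well-ordered <reasoning>...</reasoning> followed by <answer>...</answer>,
# partial credit when only some of the tags are present.
FORMAT_PATTERN = re.compile(
    r"<reasoning>.*?</reasoning>\s*<answer>.*?</answer>", re.DOTALL
)

def format_reward(completion: str) -> float:
    if FORMAT_PATTERN.search(completion):
        return 1.0
    score = 0.0
    for tag in ("<reasoning>", "</reasoning>", "<answer>", "</answer>"):
        if tag in completion:
            score += 0.25
    return score * 0.5  # at most 0.5 when the full structure is missing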
Limitations
- Small Base Model: As a 1.1B parameter model, it has limited knowledge compared to larger models
- Calculation Complexity: May struggle with complex or multi-step calculations
- Domain Specificity: Primarily focused on elementary and middle-school level mathematics
- Memory Requirements: Requires careful memory optimization when running in CPU-only environments
Usage Examples
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load tokenizer and adapter
tokenizer = AutoTokenizer.from_pretrained("Pierizvi/infused-reasoning-tinyllama-math")
model = PeftModel.from_pretrained(base_model, "Pierizvi/infused-reasoning-tinyllama-math")

# Define function to solve problems
def solve_problem(question):
    system_prompt = """You MUST respond using ONLY this exact format:
<reasoning>
Think through the problem step by step.
Show your calculations clearly.
</reasoning>
<answer>Your final answer here</answer>"""

    prompt = f"{system_prompt}\n\nQuestion: {question}\n\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=256,
            do_sample=True,
            temperature=0.5,
            pad_token_id=tokenizer.eos_token_id
        )

    # Decode only the newly generated tokens
    response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return response

# Example usage
print(solve_problem("If a train travels at 60 miles per hour for 2.5 hours, how far will it travel?"))
Technical Specifications
LoRA Configuration (see the code sketch after this list):
- Rank: 16
- Alpha: 32
- Target Modules: q_proj, k_proj, v_proj, o_proj
- Dropout: 0.05
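For reference, the settings above correspond to a PEFT LoraConfig along these lines. This is a sketch assuming the standard peft API; the actual training script is not included in this card:

from peft import LoraConfig

# LoRA settings matching the card: rank 16, alpha 32, dropout 0.05,
# applied to the attention projection matrices of TinyLlama.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)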
Memory Optimization:
- Works with 4-bit and 8-bit quantization for CPU deployment
- Can run in a Hugging Face Space with 16 GB of RAM given proper optimization (an example 4-bit loading sketch follows this list)
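As one example of such an optimization, the base model can be loaded in 4-bit before attaching the adapter. This is a sketch assuming a GPU-backed environment with the bitsandbytes package installed; CPU-only quantized inference would rely on different tooling:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config (NF4 weights, float16 compute).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Pierizvi/infused-reasoning-tinyllama-math")
model = PeftModel.from_pretrained(base_model, "Pierizvi/infused-reasoning-tinyllama-math")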
Ethical Considerations
- The model should not be used for high-stakes mathematical calculations where errors could have serious consequences
- The reasoning steps should be verified by users and not taken as definitive mathematical proofs
- The model is intended as a learning aid and not a replacement for proper mathematical education
Acknowledgments
- Thanks to the TinyLlama team for the base model
- Inspired by DeepSeek's approaches to reasoning model training
- Developed using the Hugging Face ecosystem and PEFT library
Citation
If you use this model in research, please cite:
@misc{infused-reasoning-tinyllama-math,
  author       = {pierizvi},
  title        = {TinyLlama Math Reasoning Assistant},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/Pierizvi/infused-reasoning-tinyllama-math}}
}