MathGPT-2 Kaly (distilgpt2 Fine-Tuned for Arithmetic and Basic Geometry)
This model is a fine-tuned version of DistilGPT-2 on a custom dataset consisting exclusively of arithmetic and geometry problems and their answers. The goal of this model is to act as a simple calculator for basic arithmetic and geometry problems.
Benchmark
Link not available.
Model Description
The model was trained on a dataset of generated arithmetic expressions covering addition, subtraction, multiplication, division, powers, roots, and basic geometry. The training data was generated with Python and checked to contain no duplicate expressions.
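The generation script itself is not published; the snippet below is a minimal sketch of how such a deduplicated dataset could be produced. The value ranges, the operator set, and the `generate_examples` helper are assumptions for illustration only.

```python
# Hypothetical sketch of the data generation described above; the actual
# script, operators, and number ranges used for this model are not published.
import random

def generate_examples(n=10000, seed=0):
    random.seed(seed)
    seen, examples = set(), []
    ops = [("+", lambda a, b: a + b),
           ("-", lambda a, b: a - b),
           ("*", lambda a, b: a * b)]
    while len(examples) < n:
        a, b = random.randint(0, 99), random.randint(0, 99)
        sym, fn = random.choice(ops)
        expr = f"{a} {sym} {b} ="
        if expr in seen:          # ensure no duplicate expressions
            continue
        seen.add(expr)
        examples.append(f"{expr} {fn(a, b)}")
    return examples

print(generate_examples(3))
```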
Key Features:
- Solves basic arithmetic (addition, subtraction, multiplication, division, powers, roots) and simple geometry
- Can handle simple problems like `12 + 5 =`
- Fine-tuned version of `distilgpt2` on a math-specific dataset
- Trained for 1 epoch (further improvements can be made by training for more epochs)
Model Details
- Model architecture: DistilGPT-2
- Training duration: 1 epoch (could be improved further)
- Dataset: Generated math expressions like `12 + 5 = 17`
- Tokenization: Standard GPT-2 tokenizer
- Fine-tuned on: Simple arithmetic operations
Intended Use
This model is designed to:
- Answer basic arithmetic problems (addition, subtraction, multiplication, division, powers, roots) and simple geometry problems.
- Generate answers for simple problems like `12 * 6 = ?`.
Example:
Input: `13 + 47 =`
Output: `60`
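A minimal inference sketch with the `transformers` library is shown below. The repository id `"path/to/mathgpt2-kaly"` is a placeholder, not an official name; replace it with the actual location of the fine-tuned weights.

```python
# Minimal inference sketch; the model path is a placeholder assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path/to/mathgpt2-kaly"  # replace with the actual repo id or local path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "13 + 47 ="
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False,
                         pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```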
Fine-Tuning
This model was fine-tuned from the `distilgpt2` base model for 1 epoch.
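The exact training script is not published. The following is a rough sketch of a comparable setup using the Hugging Face `Trainer`; the batch size, sequence length, `MathDataset` wrapper, and output directory are assumptions, and `train_texts` stands in for the generated dataset.

```python
# Hedged sketch of a 1-epoch fine-tuning run on generated expressions;
# hyperparameters and the dataset wrapper are assumptions, not the author's setup.
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

class MathDataset(Dataset):
    """Wraps plain-text expressions such as '12 + 5 = 17'."""
    def __init__(self, texts):
        self.enc = tokenizer(texts, truncation=True, max_length=32)
    def __len__(self):
        return len(self.enc["input_ids"])
    def __getitem__(self, i):
        return {k: v[i] for k, v in self.enc.items()}

train_texts = ["12 + 5 = 17", "9 * 3 = 27"]  # replace with the full generated dataset
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mathgpt2-kaly", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=MathDataset(train_texts),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```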
Limitations
- Basic Arithmetic Only: The model can only handle basic arithmetic (addition, subtraction, multiplication, division, powers, roots) and simple geometry. It does not handle more complex operations such as logarithms or advanced algebra.
- Limited Training Duration: The model was trained for only 1 epoch; more epochs or greater data diversity may improve its performance further.
- No real-time validation: Outputs are not verified, and the model still produces inaccurate answers for some problems.