---
license: mit
language:
- en
tags:
- generated_from_trainer
- trl
- sft
- causal-lm
- text-generation
library_name: transformers
---
# Reasoning-Small-1B
A compact and efficient 1B-parameter model fine-tuned for reasoning tasks on extracted Chain-of-Thought (CoT) data.
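## Usage

A minimal inference sketch with the `transformers` library is shown below. The repository ID `your-org/Reasoning-Small-1B`, the prompt, and the generation parameters are placeholders, not values published with this card.

```python
# Minimal inference sketch using the transformers library.
# NOTE: the repo ID below is a placeholder; substitute the actual
# Hugging Face repository that hosts this checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Reasoning-Small-1B"  # placeholder repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A step-by-step prompt tends to elicit the CoT behavior the model
# was fine-tuned for.
prompt = (
    "Question: If a train travels 60 km in 45 minutes, "
    "what is its average speed in km/h?\n"
    "Let's think step by step."
)
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```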
## Training Details
- Base Model: LLaMA 3.1 (1B parameters)
- Dataset: 2.67M extracted Chain-of-Thought examples
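Since the tags indicate the model was trained with TRL's SFT pipeline, a rough reproduction might look like the sketch below. The dataset ID, base-model ID, and hyperparameters are illustrative assumptions, not the actual training configuration.

```python
# Hypothetical SFT sketch using TRL's SFTTrainer; all IDs and
# hyperparameters here are assumptions, not the values actually
# used to train this checkpoint.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset ID standing in for the 2.67M extracted
# Chain-of-Thought examples described above.
dataset = load_dataset("your-org/cot-2.67m", split="train")

config = SFTConfig(
    output_dir="reasoning-small-1b",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    # Assumed base checkpoint: the card says LLaMA 3.1, but the
    # publicly released 1B Llama checkpoint is Llama 3.2.
    model="meta-llama/Llama-3.2-1B",
    args=config,
    train_dataset=dataset,
)
trainer.train()
```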
## Benchmarking

| Benchmark | Performance |
|---|---|
| GSM8K (Math) | 70.8% |
| ARC (Reasoning) | 69.4% |
| MATH (CoT) | 40.8% |
## License
This repository is released under the MIT License.