metadata
library_name: transformers
tags:
- math
- qwen2
- aimo
license: mit
datasets:
- Floppanacci/QWQ-LongCOT-AIMO
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
language:
- en
DeepSeek-R1-Distill-Qwen-7B Fine-tuned for AIMO Math Problems
This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B on the Floppanacci/QWQ-LongCOT-AIMO dataset.
Model Description
The model was fine-tuned to improve performance on mathematical reasoning tasks, particularly those involving step-by-step solutions (Chain-of-Thought) similar to problems found in the AI Mathematical Olympiad (AIMO) competition.
It was trained on a dataset of ~30k math questions, each paired with a detailed solution.
An AWQ quantized version is also available for faster inference and reduced memory usage.
How to Use
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # or torch.float16
    device_map="auto",
)

# Example prompt (adjust based on how the model expects input)
prompt = "Question: What is the value of $2+2$? Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate
outputs = model.generate(**inputs, max_new_tokens=8192, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
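Long chain-of-thought generations usually bury the final result at the end of a long response, so a small post-processing step can be handy. The sketch below assumes the model ends its solution with a LaTeX `\boxed{...}` expression, which is the convention the base DeepSeek-R1 model card recommends asking for in math prompts; whether this fine-tune always follows it is an assumption, not something this card states. The helper name `extract_boxed_answer` is hypothetical and not part of any library.

```python
from typing import Optional


def extract_boxed_answer(text: str) -> Optional[str]:
    # Return the contents of the last \boxed{...} in `text`, or None.
    # Assumption: the model marks its final answer with \boxed{...}.
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None  # no boxed answer found

    depth = 1  # we are just inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace reached: everything collected is the answer
                return "".join(chars).strip()
        chars.append(ch)

    # Braces never balanced, e.g. generation hit max_new_tokens mid-answer
    return None


# Stand-in for the `response` string produced by the generation code
sample_response = "Adding the two numbers gives $2+2=4$. The final answer is \\boxed{4}."
print(extract_boxed_answer(sample_response))  # prints 4
```

Taking the last `\boxed{...}` rather than the first is a design choice: reasoning traces sometimes box intermediate values before settling on a final answer. A `None` result is worth treating as "no answer" instead of guessing, since it often means the output was truncated.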
Training Data
The model was fine-tuned on the train split of the Floppanacci/QWQ-LongCOT-AIMO dataset (29.5k examples).