DeepSeek-R1-Distill-Qwen-7B Fine-tuned for AIMO Math Problems

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B on the Floppanacci/QWQ-LongCOT-AIMO dataset.

Model Description

The model was fine-tuned to improve performance on mathematical reasoning tasks, particularly those involving step-by-step solutions (Chain-of-Thought) similar to problems found in the AI Mathematical Olympiad (AIMO) competition.

It's trained on a dataset containing ~30k math questions paired with detailed solutions.

An AWQ quantized version is also available for faster inference and reduced memory usage.

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16, # or torch.float16
    device_map="auto"
)

# Example Prompt (adjust based on how the model expects input)
prompt = "Question: What is the value of $2+2$? Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate
outputs = model.generate(**inputs, max_new_tokens=8192, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)

Training Data

The model was fine-tuned on the train split of the Floppanacci/QWQ-LongCOT-AIMO dataset (29.5k examples).

Downloads last month
2
Safetensors
Model size
7.62B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci

Finetuned
(96)
this model
Quantizations
1 model

Dataset used to train Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci

Collection including Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci