---
license: apache-2.0
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
tags:
- generated_from_trainer
- mathematics
- lean
- theorem-proving
- reasoning
- fine-tuned
- lora
model-index:
- name: DeepSeek-R1-Distill-Qwen-32B-for-lean
  results: []
datasets:
- custom_lean_mathematics
language:
- en
pipeline_tag: text-generation
---

# DeepSeek-R1-Distill-Qwen-32B-for-lean

✅ **Training Status: COMPLETED**

This model is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B), optimized for mathematical reasoning and Lean theorem-proving tasks.

## 🎯 Model Overview

The model was fine-tuned with the LoRA (Low-Rank Adaptation) technique on a custom dataset of mathematical problems and Lean proofs, strengthening the base model's capabilities in formal mathematics and theorem proving.

## 📊 Training Results

| Metric | Value |
|--------|-------|
| **Final Training Step** | 4,200 (early stopped from a planned 8,550) |
| **Best Validation Loss** | 0.516 |
| **Training Epochs** | ~1.0 (0.98) |
| **Early Stopping** | ✅ Applied (patience = 2) |
| **Model Selection** | Best checkpoint automatically selected |

## 🔧 Training Configuration

### Hyperparameters

- **Learning Rate**: 2e-4
- **Batch Size**: 6 per device
- **Gradient Accumulation**: 1 step
- **Warmup Steps**: 5
- **Weight Decay**: 0.01
- **LR Scheduler**: linear
- **Optimizer**: paged_adamw_8bit
- **Precision**: bfloat16

### LoRA Configuration

- **LoRA Rank**: 32
- **LoRA Alpha**: 32
- **Target Modules**: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- **Dropout**: 0.05
- **Max Sequence Length**: 4096

### Dataset

- **Size**: 25,650 examples
- **Type**: Mathematical reasoning and Lean theorem proving
- **Preprocessing**: Custom formatting for question-answer pairs (hedged sketches of the adapter configuration and preprocessing follow below)
- **Validation Split**: 10%
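For reference, the LoRA settings listed above correspond roughly to the following PEFT `LoraConfig`. This is a minimal sketch for orientation only, not the actual training script (which is not included in this repository; per the acknowledgments, training used Unsloth). The `task_type` and any omitted arguments are assumptions.

```python
# Minimal sketch of a PEFT LoraConfig matching the settings listed above.
# The real training setup is not published here; treat this as illustrative only.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,  # assumed causal-LM task type
    r=32,                # LoRA rank
    lora_alpha=32,       # LoRA alpha
    lora_dropout=0.05,   # dropout on the LoRA layers
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```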
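The exact question-answer formatting used during training is not published. The sketch below shows one plausible way such pairs could be formatted and split 90/10 for validation; the file name, field names, and prompt template are hypothetical.

```python
# Hypothetical preprocessing sketch: the prompt template, file name, and field
# names below are assumptions; the actual preprocessing code is not published.
from datasets import load_dataset

def format_example(example):
    # Assumed "question"/"answer" fields; adjust to the real dataset schema.
    return {
        "text": f"### Question:\n{example['question']}\n\n### Answer:\n{example['answer']}"
    }

dataset = load_dataset("json", data_files="lean_math_examples.jsonl", split="train")
dataset = dataset.map(format_example)

# Carve out the 10% validation split reported above
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```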
## 🚀 Usage

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")

# Generate a response
def generate_response(prompt, max_length=512):
    # Move inputs onto the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Prove that the sum of two even numbers is even."
response = generate_response(prompt)
print(response)
```

### For Lean Theorem Proving

```python
# Lean-specific prompt format
lean_prompt = """
theorem sum_of_evens_is_even (a b : ℤ) (ha : even a) (hb : even b) : even (a + b) := by
  sorry
"""

proof = generate_response(f"Complete this Lean proof:\n{lean_prompt}")
print(proof)
```

## 🎯 Intended Use Cases

- **Mathematical Reasoning**: Solving complex mathematical problems
- **Lean Theorem Proving**: Generating and completing formal proofs
- **Educational Support**: Assisting with mathematics education
- **Research**: Supporting formal verification research

## ⚠️ Limitations

- **Domain Specific**: Optimized primarily for mathematics and Lean
- **LoRA Adaptation**: Requires the base model for full functionality
- **Computational Requirements**: Needs significant GPU memory for inference
- **Training Data**: Performance is tied to the training data distribution

## 📈 Performance Notes

Training was stopped early at step 4,200, which prevented overfitting while preserving validation performance. The best validation loss of 0.516 suggests reasonable generalization on mathematical reasoning tasks.

## 🔄 Model Versions

- **Current**: Step 4,200 (best checkpoint selected by early stopping)
- **Architecture**: DeepSeek-R1-Distill-Qwen-32B + LoRA
- **Format**: PEFT-compatible LoRA adapter

## 📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{deepseek-r1-distill-lean-2025,
  title={DeepSeek-R1-Distill-Qwen-32B-for-lean},
  author={Chattso-GPT},
  year={2025},
  howpublished={\url{https://huggingface.co/Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean}},
}
```

## 📄 License

This model is released under the Apache 2.0 License, in line with the base model's licensing terms.

## 🤝 Acknowledgments

- Base model: [DeepSeek AI](https://huggingface.co/deepseek-ai) for the foundation model
- Training framework: Unsloth for efficient fine-tuning
- Community: Hugging Face for model hosting and tools

## 📞 Contact

For questions or issues regarding this model, please open an issue in the model repository or contact the model author.

---

*Training completed with early stopping at optimal performance. The model is ready for mathematical reasoning and Lean theorem-proving tasks.*