DeepSeek-R1-Distill-Qwen-32B-for-lean
✅ Training Status: COMPLETED
This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, specifically optimized for mathematical reasoning and Lean theorem proving tasks.
🎯 Model Overview
This model has been fine-tuned with the LoRA (Low-Rank Adaptation) technique on a custom dataset of mathematical problems and Lean proofs, enhancing the base model's capabilities in formal mathematics and theorem proving.
📊 Training Results
| Metric | Value |
|---|---|
| Final Training Step | 4,200 (early stopped from 8,550) |
| Best Validation Loss | 0.516 |
| Training Epochs | ~1.0 (0.98) |
| Early Stopping | ✅ Applied (patience=2) |
| Model Selection | Best checkpoint automatically selected |
🔧 Training Configuration
Hyperparameters
- Learning Rate: 2e-4
- Batch Size: 6 per device
- Gradient Accumulation: 1 step
- Warmup Steps: 5
- Weight Decay: 0.01
- LR Scheduler: linear
- Optimizer: paged_adamw_8bit
- Precision: bfloat16
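For reference, the hyperparameters above correspond roughly to a `transformers.TrainingArguments` setup like the sketch below. The actual training script (which used Unsloth) is not included in this repository, so any argument not in the list above is an assumption:

```python
from transformers import TrainingArguments

# Sketch of the training arguments implied by the hyperparameter list above.
# Values not listed on this card (output_dir, eval/save strategy, etc.) are guesses.
training_args = TrainingArguments(
    output_dir="outputs",               # assumed
    per_device_train_batch_size=6,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    warmup_steps=5,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",
    bf16=True,
    num_train_epochs=1,                 # ~1 epoch per the results table
    eval_strategy="steps",              # assumed; required for early stopping
    save_strategy="steps",              # assumed
    load_best_model_at_end=True,        # "best checkpoint automatically selected"
    metric_for_best_model="eval_loss",  # assumed
)
```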
LoRA Configuration
- LoRA Rank: 32
- LoRA Alpha: 32
- Target Modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- Dropout: 0.05
- Max Sequence Length: 4096
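In PEFT terms, this corresponds roughly to the `LoraConfig` sketched below. The released adapter already encodes these settings, so this is only needed to reproduce the fine-tuning setup; `bias` and `task_type` are assumptions:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",             # assumed
    task_type="CAUSAL_LM",
)
```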
Dataset
- Size: 25,650 examples
- Type: Mathematical reasoning and Lean theorem proving
- Preprocessing: Custom formatting for question-answer pairs
- Validation Split: 10%
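The preprocessing script is not published, but a question-answer formatting step with a 10% validation split typically looks like the sketch below; the data file and the `question`/`answer` field names are hypothetical:

```python
from datasets import load_dataset

# Hypothetical data file and field names; adjust to the real dataset.
dataset = load_dataset("json", data_files="lean_math_qa.jsonl", split="train")

def format_example(example):
    # Simple question/answer prompt layout (assumed formatting).
    example["text"] = (
        f"### Question:\n{example['question']}\n\n### Answer:\n{example['answer']}"
    )
    return example

dataset = dataset.map(format_example)
split = dataset.train_test_split(test_size=0.1, seed=42)  # 10% validation split
train_ds, eval_ds = split["train"], split["test"]
```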
🚀 Usage
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")
model.eval()

# Generate a response for a plain-text prompt
def generate_response(prompt, max_new_tokens=512):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Prove that the sum of two even numbers is even."
response = generate_response(prompt)
print(response)
```
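The R1-distilled base model also ships with a chat template; whether the fine-tuning data used it is not stated on this card, but wrapping prompts with `tokenizer.apply_chat_template` is worth trying as an alternative to the raw-text prompt above:

```python
# Alternative: format the prompt with the base model's chat template.
messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]
chat_inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(chat_inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```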
For Lean Theorem Proving
```python
# Lean-specific prompt format
lean_prompt = """
theorem sum_of_evens_is_even (a b : ℤ) (ha : even a) (hb : even b) : even (a + b) := by
  sorry
"""

proof = generate_response(f"Complete this Lean proof:\n{lean_prompt}")
print(proof)
```
🎯 Intended Use Cases
- Mathematical Reasoning: Solving complex mathematical problems
- Lean Theorem Proving: Generating and completing formal proofs
- Educational Support: Assisting with mathematics education
- Research: Supporting formal verification research
⚠️ Limitations
- Domain Specific: Optimized primarily for mathematics and Lean
- LoRA Adaptation: Requires base model for full functionality
- Computational Requirements: Inference on the 32B base model needs significant GPU memory (a 4-bit loading sketch follows this list)
- Training Data: Performance limited to training data distribution
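If full bfloat16 loading does not fit on your hardware, one option is to load the base model in 4-bit with bitsandbytes before attaching the adapter. This is a minimal sketch, not validated for this particular adapter, and assumes a bitsandbytes-capable GPU:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantized loading to reduce GPU memory (quantization settings are assumptions)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")
```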
📈 Performance Notes
Training was stopped early at step 4,200 of a planned 8,550 once validation loss stopped improving (patience = 2), and the best checkpoint (validation loss 0.516) was kept as the released adapter. Stopping at this point guards against overfitting while retaining the strongest performance observed on the held-out mathematical reasoning and Lean proving data.
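The early-stopping behaviour described above (patience = 2, best checkpoint restored) matches the standard transformers callback. A sketch of how it is typically wired into a trainer, reusing names from the configuration and dataset sketches earlier on this card; the actual training script is not published, so this is an assumption about the setup:

```python
from transformers import EarlyStoppingCallback, Trainer

# Requires eval/save strategies aligned, load_best_model_at_end=True, and
# metric_for_best_model set in the training arguments (see the sketch above).
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```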
🔄 Model Versions
- Current: Step 4,200 (Best model with early stopping)
- Architecture: DeepSeek-R1-Distill-Qwen-32B + LoRA
- Format: PEFT-compatible LoRA adapter
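Because the release is a PEFT-compatible LoRA adapter, it can also be merged into the base weights to produce a standalone checkpoint, e.g. for serving stacks that do not support adapters. A sketch using the standard peft `merge_and_unload()` call; the output directory name is hypothetical:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
merged = PeftModel.from_pretrained(base, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")
merged = merged.merge_and_unload()                      # fold LoRA weights into the base model
merged.save_pretrained("deepseek-r1-32b-lean-merged")   # hypothetical output directory
```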
📚 Citation
If you use this model in your research, please cite:
```bibtex
@misc{deepseek-r1-distill-lean-2025,
  title        = {DeepSeek-R1-Distill-Qwen-32B-for-lean},
  author       = {Chattso-GPT},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean}},
}
```
📄 License
This model is released under the Apache 2.0 License, following the base model's licensing terms.
🤝 Acknowledgments
- Base model: DeepSeek AI for the foundation model
- Training framework: Unsloth for efficient fine-tuning
- Community: Hugging Face for model hosting and tools
📞 Contact
For questions or issues regarding this model, please open an issue in the model repository or contact the model author.
Training completed with early stopping at optimal performance. Model ready for mathematical reasoning and Lean theorem proving tasks.