DeepSeek-R1-Distill-Qwen-32B-for-lean

Training Status: COMPLETED

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, specifically optimized for mathematical reasoning and Lean theorem proving tasks.

🎯 Model Overview

This model has been fine-tuned with the LoRA (Low-Rank Adaptation) technique on a custom dataset of mathematical problems and Lean proofs, enhancing the base model's capabilities in formal mathematics and theorem proving.

📊 Training Results

| Metric | Value |
|---|---|
| Final Training Step | 4,200 (early stopped from 8,550) |
| Best Validation Loss | 0.516 |
| Training Epochs | ~1.0 (0.98) |
| Early Stopping | ✅ Applied (patience=2) |
| Model Selection | Best checkpoint automatically selected |

🔧 Training Configuration

Hyperparameters

  • Learning Rate: 2e-4
  • Batch Size: 6 per device
  • Gradient Accumulation: 1 step
  • Warmup Steps: 5
  • Weight Decay: 0.01
  • LR Scheduler: linear
  • Optimizer: paged_adamw_8bit
  • Precision: bfloat16
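
The full training script is not included in this repository; the sketch below shows how the hyperparameters above, together with the patience=2 early stopping from the results table, would typically be expressed with the transformers Trainer API (the actual run used Unsloth, per the Acknowledgments). The evaluation interval, epoch count, and all object names (`model`, `train_dataset`, `eval_dataset`) are assumptions or placeholders.

```python
# Minimal sketch, not the original training script: the hyperparameters above
# mapped onto transformers TrainingArguments, plus patience=2 early stopping.
from transformers import TrainingArguments, Trainer, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=1,
    warmup_steps=5,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",
    bf16=True,
    num_train_epochs=2,                # assumption: 8,550 planned steps ≈ 2 epochs, stopped at step 4,200 (~0.98 epochs)
    eval_strategy="steps",             # named evaluation_strategy in older transformers releases
    eval_steps=200,                    # assumed evaluation interval
    save_strategy="steps",
    save_steps=200,
    load_best_model_at_end=True,       # keeps the best checkpoint (step 4,200 here)
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,                       # placeholder: PEFT-wrapped base model (see LoRA sketch below)
    args=training_args,
    train_dataset=train_dataset,       # placeholder: formatted and tokenized dataset (see Dataset sketch below)
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```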

LoRA Configuration

  • LoRA Rank: 32
  • LoRA Alpha: 32
  • Target Modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  • Dropout: 0.05
  • Max Sequence Length: 4096
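
For reference, the same LoRA settings expressed as a peft `LoraConfig`. This is a sketch: `bias="none"` and the causal-LM task type are assumptions not stated above, and the 4096-token maximum sequence length is applied during tokenization rather than in the LoRA config.

```python
# Sketch of the LoRA setup above using peft; the actual run used Unsloth,
# which accepts analogous arguments.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",              # assumption, not stated in the card
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)  # base_model loaded as in Quick Start
model.print_trainable_parameters()               # only the LoRA weights are trainable
```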

Dataset

  • Size: 25,650 examples
  • Type: Mathematical reasoning and Lean theorem proving
  • Preprocessing: Custom formatting for question-answer pairs
  • Validation Split: 10%
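
The training dataset itself is not published. The snippet below only illustrates the kind of question-answer formatting and 10% validation split described above; the file name, field names (`question`, `answer`), and prompt template are hypothetical.

```python
# Illustrative only: hypothetical file, field names, and prompt template showing
# question-answer formatting and a 90/10 train/validation split.
from datasets import load_dataset

raw = load_dataset("json", data_files="lean_math_dataset.jsonl", split="train")  # hypothetical file

def format_example(example):
    # Assumed template; the actual "custom formatting" is not published.
    example["text"] = (
        f"### Question:\n{example['question']}\n\n"
        f"### Answer:\n{example['answer']}"
    )
    return example

formatted = raw.map(format_example)
splits = formatted.train_test_split(test_size=0.1, seed=42)  # 10% validation split
train_dataset, eval_dataset = splits["train"], splits["test"]
```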

🚀 Usage

Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")

# Generate response
def generate_response(prompt, max_new_tokens=512):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Prove that the sum of two even numbers is even."
response = generate_response(prompt)
print(response)
```

For Lean Theorem Proving

```python
# Lean-specific prompt format (Lean 4 / Mathlib syntax)
lean_prompt = """
theorem sum_of_evens_is_even (a b : ℤ) (ha : Even a) (hb : Even b) : Even (a + b) := by
  sorry
"""

proof = generate_response(f"Complete this Lean proof:\n{lean_prompt}")
print(proof)
```
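
For reference, one valid completion of this prompt, assuming a recent Lean 4 Mathlib where the lemma `Even.add` is available, looks like the following; the model's actual output may of course differ.

```lean
import Mathlib

-- One possible completion: Even.add combines the two evenness hypotheses.
theorem sum_of_evens_is_even (a b : ℤ) (ha : Even a) (hb : Even b) : Even (a + b) :=
  ha.add hb
```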

🎯 Intended Use Cases

  • Mathematical Reasoning: Solving complex mathematical problems
  • Lean Theorem Proving: Generating and completing formal proofs
  • Educational Support: Assisting with mathematics education
  • Research: Supporting formal verification research

⚠️ Limitations

  • Domain Specific: Optimized primarily for mathematics and Lean
  • LoRA Adaptation: Requires the base model for full functionality
  • Computational Requirements: Needs significant GPU memory for inference (a 4-bit loading sketch follows this list)
  • Training Data: Performance limited to training data distribution
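
If bf16 inference does not fit in available GPU memory, the base model can be loaded in 4-bit before attaching the adapter. This is an untested sketch using the standard bitsandbytes quantization path in transformers, not an officially validated configuration; quality and latency trade-offs are not benchmarked here.

```python
# Sketch: reduce GPU memory by loading the 32B base model in 4-bit (NF4)
# before attaching the LoRA adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")
```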

📈 Performance Notes

This model reached its lowest validation loss (0.516) at step 4,200, where early stopping halted training to limit overfitting; that checkpoint is the one released here. The low validation loss suggests good generalization on the mathematical reasoning and Lean data the model was trained on.

🔄 Model Versions

  • Current: Step 4,200 (Best model with early stopping)
  • Architecture: DeepSeek-R1-Distill-Qwen-32B + LoRA
  • Format: PEFT-compatible LoRA adapter
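
Because the release is a PEFT adapter, it can also be merged into the base weights for standalone deployment. A minimal sketch using peft's `merge_and_unload`, assuming `model` and `tokenizer` are loaded as in the Quick Start (with the base model in full bf16, not quantized):

```python
# Sketch: merge the LoRA adapter into the base weights and save a standalone copy.
# `model` is the PeftModel and `tokenizer` the tokenizer from the Quick Start example.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("DeepSeek-R1-Distill-Qwen-32B-for-lean-merged")
tokenizer.save_pretrained("DeepSeek-R1-Distill-Qwen-32B-for-lean-merged")
```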

📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{deepseek-r1-distill-lean-2025,
  title={DeepSeek-R1-Distill-Qwen-32B-for-lean},
  author={Chattso-GPT},
  year={2025},
  howpublished={\url{https://huggingface.co/Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean}}
}
```

📄 License

This model is released under the Apache 2.0 License, following the base model's licensing terms.

🤝 Acknowledgments

  • Base model: DeepSeek AI for the foundation model
  • Training framework: Unsloth for efficient fine-tuning
  • Community: Hugging Face for model hosting and tools

📞 Contact

For questions or issues regarding this model, please open an issue in the model repository or contact the model author.


Training completed with early stopping at the best validation checkpoint. The model is ready for mathematical reasoning and Lean theorem proving tasks.
