DeepSeek-R1-Distill-Qwen-32B-for-lean

Training Status: COMPLETED

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, specifically optimized for mathematical reasoning and Lean theorem proving tasks.

🎯 Model Overview

This model has been fine-tuned with the LoRA (Low-Rank Adaptation) technique on a custom dataset of mathematical problems and Lean proofs, enhancing the base model's capabilities in formal mathematics and theorem proving.

📊 Training Results

| Metric | Value |
|---|---|
| Final Training Step | 4,200 (early stopped from 8,550) |
| Best Validation Loss | 0.516 |
| Training Epochs | ~1.0 (0.98) |
| Early Stopping | ✅ Applied (patience=2) |
| Model Selection | Best checkpoint automatically selected |

🔧 Training Configuration

Hyperparameters

  • Learning Rate: 2e-4
  • Batch Size: 6 per device
  • Gradient Accumulation: 1 step
  • Warmup Steps: 5
  • Weight Decay: 0.01
  • LR Scheduler: linear
  • Optimizer: paged_adamw_8bit
  • Precision: bfloat16
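
The full training script is not included in this repository; the sketch below shows how the hyperparameters above, together with the patience=2 early stopping from the results table, would typically be expressed with the transformers Trainer API (the actual run used Unsloth, per the Acknowledgments). The evaluation interval, epoch count, and all object names (`model`, `train_dataset`, `eval_dataset`) are assumptions or placeholders.

```python
# Minimal sketch, not the original training script: the hyperparameters above
# mapped onto transformers TrainingArguments, plus patience=2 early stopping.
from transformers import TrainingArguments, Trainer, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=1,
    warmup_steps=5,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",
    bf16=True,
    num_train_epochs=2,                # assumption: 8,550 planned steps ≈ 2 epochs, stopped at step 4,200 (~0.98 epochs)
    eval_strategy="steps",             # named evaluation_strategy in older transformers releases
    eval_steps=200,                    # assumed evaluation interval
    save_strategy="steps",
    save_steps=200,
    load_best_model_at_end=True,       # keeps the best checkpoint (step 4,200 here)
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,                       # placeholder: PEFT-wrapped base model (see LoRA sketch below)
    args=training_args,
    train_dataset=train_dataset,       # placeholder: formatted and tokenized dataset (see Dataset sketch below)
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```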

LoRA Configuration

  • LoRA Rank: 32
  • LoRA Alpha: 32
  • Target Modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  • Dropout: 0.05
  • Max Sequence Length: 4096
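
For reference, the same LoRA settings expressed as a peft `LoraConfig`. This is a sketch: `bias="none"` and the causal-LM task type are assumptions not stated above, and the 4096-token maximum sequence length is applied during tokenization rather than in the LoRA config.

```python
# Sketch of the LoRA setup above using peft; the actual run used Unsloth,
# which accepts analogous arguments.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",              # assumption, not stated in the card
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)  # base_model loaded as in Quick Start
model.print_trainable_parameters()               # only the LoRA weights are trainable
```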

Dataset

  • Size: 25,650 examples
  • Type: Mathematical reasoning and Lean theorem proving
  • Preprocessing: Custom formatting for question-answer pairs
  • Validation Split: 10%
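
The training dataset itself is not published. The snippet below only illustrates the kind of question-answer formatting and 10% validation split described above; the file name, field names (`question`, `answer`), and prompt template are hypothetical.

```python
# Illustrative only: hypothetical file, field names, and prompt template showing
# question-answer formatting and a 90/10 train/validation split.
from datasets import load_dataset

raw = load_dataset("json", data_files="lean_math_dataset.jsonl", split="train")  # hypothetical file

def format_example(example):
    # Assumed template; the actual "custom formatting" is not published.
    example["text"] = (
        f"### Question:\n{example['question']}\n\n"
        f"### Answer:\n{example['answer']}"
    )
    return example

formatted = raw.map(format_example)
splits = formatted.train_test_split(test_size=0.1, seed=42)  # 10% validation split
train_dataset, eval_dataset = splits["train"], splits["test"]
```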

🚀 Usage

Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")

# Generate response
def generate_response(prompt, max_new_tokens=512):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Prove that the sum of two even numbers is even."
response = generate_response(prompt)
print(response)
```

For Lean Theorem Proving

```python
# Lean-specific prompt format (Lean 4 / Mathlib syntax)
lean_prompt = """
theorem sum_of_evens_is_even (a b : ℤ) (ha : Even a) (hb : Even b) : Even (a + b) := by
  sorry
"""

proof = generate_response(f"Complete this Lean proof:\n{lean_prompt}")
print(proof)
```
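
For reference, one valid completion of this prompt, assuming a recent Lean 4 Mathlib where the lemma `Even.add` is available, looks like the following; the model's actual output may of course differ.

```lean
import Mathlib

-- One possible completion: Even.add combines the two evenness hypotheses.
theorem sum_of_evens_is_even (a b : ℤ) (ha : Even a) (hb : Even b) : Even (a + b) :=
  ha.add hb
```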

🎯 Intended Use Cases

  • Mathematical Reasoning: Solving complex mathematical problems
  • Lean Theorem Proving: Generating and completing formal proofs
  • Educational Support: Assisting with mathematics education
  • Research: Supporting formal verification research

⚠️ Limitations

  • Domain Specific: Optimized primarily for mathematics and Lean
  • LoRA Adaptation: Requires the base model for full functionality
  • Computational Requirements: Needs significant GPU memory for inference (a 4-bit loading sketch follows this list)
  • Training Data: Performance limited to training data distribution
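
If bf16 inference does not fit in available GPU memory, the base model can be loaded in 4-bit before attaching the adapter. This is an untested sketch using the standard bitsandbytes quantization path in transformers, not an officially validated configuration; quality and latency trade-offs are not benchmarked here.

```python
# Sketch: reduce GPU memory by loading the 32B base model in 4-bit (NF4)
# before attaching the LoRA adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")
```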

📈 Performance Notes

This model reached its lowest validation loss (0.516) at step 4,200, where early stopping halted training to limit overfitting; that checkpoint is the one released here. The low validation loss suggests good generalization on the mathematical reasoning and Lean data the model was trained on.

🔄 Model Versions

  • Current: Step 4,200 (Best model with early stopping)
  • Architecture: DeepSeek-R1-Distill-Qwen-32B + LoRA
  • Format: PEFT-compatible LoRA adapter
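
Because the release is a PEFT adapter, it can also be merged into the base weights for standalone deployment. A minimal sketch using peft's `merge_and_unload`, assuming `model` and `tokenizer` are loaded as in the Quick Start (with the base model in full bf16, not quantized):

```python
# Sketch: merge the LoRA adapter into the base weights and save a standalone copy.
# `model` is the PeftModel and `tokenizer` the tokenizer from the Quick Start example.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("DeepSeek-R1-Distill-Qwen-32B-for-lean-merged")
tokenizer.save_pretrained("DeepSeek-R1-Distill-Qwen-32B-for-lean-merged")
```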

📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{deepseek-r1-distill-lean-2025,
  title={DeepSeek-R1-Distill-Qwen-32B-for-lean},
  author={Chattso-GPT},
  year={2025},
  howpublished={\url{https://huggingface.co/Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean}}
}
```

📄 License

This model is released under the Apache 2.0 License, following the base model's licensing terms.

🤝 Acknowledgments

  • Base model: DeepSeek AI for the foundation model
  • Training framework: Unsloth for efficient fine-tuning
  • Community: Hugging Face for model hosting and tools

📞 Contact

For questions or issues regarding this model, please open an issue in the model repository or contact the model author.


Training completed with early stopping at the best validation checkpoint. The model is ready for mathematical reasoning and Lean theorem proving tasks.
