---
license: apache-2.0
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
tags:
- generated_from_trainer
- mathematics
- lean
- theorem-proving
- reasoning
- fine-tuned
- lora
model-index:
- name: DeepSeek-R1-Distill-Qwen-32B-for-lean
  results: []
datasets:
- custom_lean_mathematics
language:
- en
pipeline_tag: text-generation
---

# DeepSeek-R1-Distill-Qwen-32B-for-lean

✅ **Training Status: COMPLETED**

This model is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B), optimized for mathematical reasoning and Lean theorem-proving tasks.

## 🎯 Model Overview

The model was fine-tuned with the LoRA (Low-Rank Adaptation) technique on a custom dataset of mathematical problems and Lean proofs, strengthening the base model's capabilities in formal mathematics and theorem proving.

## 📊 Training Results

| Metric | Value |
|--------|-------|
| **Final Training Step** | 4,200 (early stopped from a planned 8,550) |
| **Best Validation Loss** | 0.516 |
| **Training Epochs** | ~1.0 (0.98) |
| **Early Stopping** | ✅ Applied (patience = 2) |
| **Model Selection** | Best checkpoint automatically selected |

## 🔧 Training Configuration

### Hyperparameters

- **Learning Rate**: 2e-4
- **Batch Size**: 6 per device
- **Gradient Accumulation**: 1 step
- **Warmup Steps**: 5
- **Weight Decay**: 0.01
- **LR Scheduler**: linear
- **Optimizer**: paged_adamw_8bit
- **Precision**: bfloat16

### LoRA Configuration

- **LoRA Rank**: 32
- **LoRA Alpha**: 32
- **Target Modules**: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- **Dropout**: 0.05
- **Max Sequence Length**: 4096

### Dataset

- **Size**: 25,650 examples
- **Type**: Mathematical reasoning and Lean theorem proving
- **Preprocessing**: Custom formatting for question-answer pairs (hedged sketches of the adapter configuration and preprocessing follow below)
- **Validation Split**: 10%
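For reference, the LoRA settings listed above correspond roughly to the following PEFT `LoraConfig`. This is a minimal sketch for orientation only, not the actual training script (which is not included in this repository; per the acknowledgments, training used Unsloth). The `task_type` and any omitted arguments are assumptions.

```python
# Minimal sketch of a PEFT LoraConfig matching the settings listed above.
# The real training setup is not published here; treat this as illustrative only.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,  # assumed causal-LM task type
    r=32,                # LoRA rank
    lora_alpha=32,       # LoRA alpha
    lora_dropout=0.05,   # dropout on the LoRA layers
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```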
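The exact question-answer formatting used during training is not published. The sketch below shows one plausible way such pairs could be formatted and split 90/10 for validation; the file name, field names, and prompt template are hypothetical.

```python
# Hypothetical preprocessing sketch: the prompt template, file name, and field
# names below are assumptions; the actual preprocessing code is not published.
from datasets import load_dataset

def format_example(example):
    # Assumed "question"/"answer" fields; adjust to the real dataset schema.
    return {
        "text": f"### Question:\n{example['question']}\n\n### Answer:\n{example['answer']}"
    }

dataset = load_dataset("json", data_files="lean_math_examples.jsonl", split="train")
dataset = dataset.map(format_example)

# Carve out the 10% validation split reported above
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```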
## 🚀 Usage

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, "Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean")

# Generate a response
def generate_response(prompt, max_length=512):
    # Move inputs onto the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Prove that the sum of two even numbers is even."
response = generate_response(prompt)
print(response)
```

### For Lean Theorem Proving

```python
# Lean-specific prompt format
lean_prompt = """
theorem sum_of_evens_is_even (a b : ℤ) (ha : even a) (hb : even b) : even (a + b) := by
  sorry
"""

proof = generate_response(f"Complete this Lean proof:\n{lean_prompt}")
print(proof)
```

## 🎯 Intended Use Cases

- **Mathematical Reasoning**: Solving complex mathematical problems
- **Lean Theorem Proving**: Generating and completing formal proofs
- **Educational Support**: Assisting with mathematics education
- **Research**: Supporting formal verification research

## ⚠️ Limitations

- **Domain Specific**: Optimized primarily for mathematics and Lean
- **LoRA Adaptation**: Requires the base model for full functionality
- **Computational Requirements**: Needs significant GPU memory for inference
- **Training Data**: Performance is tied to the training data distribution

## 📈 Performance Notes

Training was stopped early at step 4,200, which prevented overfitting while preserving validation performance. The best validation loss of 0.516 suggests reasonable generalization on mathematical reasoning tasks.

## 🔄 Model Versions

- **Current**: Step 4,200 (best checkpoint selected by early stopping)
- **Architecture**: DeepSeek-R1-Distill-Qwen-32B + LoRA
- **Format**: PEFT-compatible LoRA adapter

## 📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{deepseek-r1-distill-lean-2025,
  title={DeepSeek-R1-Distill-Qwen-32B-for-lean},
  author={Chattso-GPT},
  year={2025},
  howpublished={\url{https://huggingface.co/Chattso-GPT/DeepSeek-R1-Distill-Qwen-32B-for-lean}},
}
```

## 📄 License

This model is released under the Apache 2.0 License, in line with the base model's licensing terms.

## 🤝 Acknowledgments

- Base model: [DeepSeek AI](https://huggingface.co/deepseek-ai) for the foundation model
- Training framework: Unsloth for efficient fine-tuning
- Community: Hugging Face for model hosting and tools

## 📞 Contact

For questions or issues regarding this model, please open an issue in the model repository or contact the model author.

---

*Training completed with early stopping at optimal performance. The model is ready for mathematical reasoning and Lean theorem-proving tasks.*