Model Description
MedMistral-CPT-SFT-7B is a French medical language model built on Mistral-7B-v0.1 and adapted to the medical domain through Continual Pre-Training (CPT) followed by Supervised Fine-Tuning (SFT).
Model Details
- Model Type: Causal Language Model
- Base Model: Mistral-7B-v0.1
- Language: French
- Domain: Medical/Healthcare
- License: Apache 2.0
- Paper: Adaptation des connaissances médicales pour les grands modèles de langue : Stratégies et analyse comparative (in English: "Adapting Medical Knowledge for Large Language Models: Strategies and Comparative Analysis")
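For reference, a minimal inference sketch is shown below. The repository id is a placeholder (substitute the actual Hugging Face repo for this model), and the French Q/A prompt format is an assumption, not a documented template.

```python
# Minimal inference sketch. The repo id below is a placeholder, not the
# actual published location of MedMistral-CPT-SFT-7B; the prompt format
# is likewise an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ORG/MedMistral-CPT-SFT-7B"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "Question : Quels sont les symptômes de l'hypertension ?\nRéponse :"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```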
Training Details
Continual Pre-Training (CPT)
- Dataset: NACHOS corpus (opeN crAwled frenCh Healthcare cOrpuS)
- Size: 7.4 GB of French medical texts
- Word Count: 1,088,867,950 (over 1 billion words)
- Sources: 24 French medical websites
- Training Duration: 2.8 epochs
- Hardware: 32 NVIDIA H100 80GB GPUs
- Training Time: 12 hours
- Optimizer: AdamW
- Learning Rate: 2e-5
- Weight Decay: 0.01
- Batch Size: 16, with gradient accumulation steps of 2 (see the sketch after this list)
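The hyperparameters above translate roughly into the following Hugging Face Trainer setup. This is a sketch, not the authors' training script: the dataset files, sequence length, and bf16 setting are assumptions (the NACHOS corpus is not packaged here).

```python
# CPT sketch using the hyperparameters from the list above.
# Assumptions: local text files stand in for NACHOS, max_length=2048,
# and bf16 mixed precision; none of these are stated in the card.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical local files standing in for the NACHOS corpus.
dataset = load_dataset("text", data_files={"train": "nachos/*.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="medmistral-cpt",
    per_device_train_batch_size=16,   # batch size from the card
    gradient_accumulation_steps=2,    # gradient accumulation from the card
    learning_rate=2e-5,
    weight_decay=0.01,
    optim="adamw_torch",              # AdamW
    num_train_epochs=2.8,             # ~2.8 epochs over the corpus
    bf16=True,                        # assumption; not stated in the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```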
Supervised Fine-Tuning (SFT)
- Dataset: 30K French medical question-answer pairs
  - 10K native French medical questions
  - 10K medical questions translated from English resources
  - 10K questions generated from French medical texts
- Method: DoRA (Weight-Decomposed Low-Rank Adaptation); a configuration sketch follows this list
- Training Duration: 10 epochs
- Hardware: 1 NVIDIA A100 80GB GPU
- Training Time: 75 hours
- Rank: 16
- Alpha: 16
- Learning Rate: 2e-5
- Batch Size: 4
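The rank and alpha values above correspond to a PEFT DoRA configuration along the following lines. This is a sketch under assumptions: the CPT checkpoint path and the target modules are not specified in this card.

```python
# DoRA adapter sketch matching the SFT hyperparameters above
# (rank 16, alpha 16). Checkpoint path and target modules are assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical path to the CPT checkpoint produced in the previous stage.
base = AutoModelForCausalLM.from_pretrained("medmistral-cpt")

config = LoraConfig(
    r=16,                         # rank from the card
    lora_alpha=16,                # alpha from the card
    use_dora=True,                # DoRA: Weight-Decomposed Low-Rank Adaptation
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```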
Computational Impact
- Total Training Time: 87 hours (12h CPT + 75h SFT)
- Carbon Emissions: 11.78 kgCO2e (9.86 kgCO2e CPT + 1.92 kgCO2e SFT)
Ethical Considerations
- Medical Accuracy: This model is intended for research and educational purposes only. All outputs should be verified by qualified medical professionals.
- Bias: The training data may contain biases present in medical literature and online medical resources.
Citation
If you use this model, please cite the paper listed above (Adaptation des connaissances médicales pour les grands modèles de langue : Stratégies et analyse comparative).
Contact
For questions about this model, please contact: [email protected]