---
library_name: transformers
tags:
- medical
license: apache-2.0
language:
- fr
- en
base_model:
- ik-ram28/BioMistral-CPT-7B
- BioMistral/BioMistral-7B
---

## Model Description

BioMistral-CPT-SFT-7B is a French medical language model based on BioMistral-7B, adapted to French medical-domain applications through a combined approach of Continual Pre-Training (CPT) followed by Supervised Fine-Tuning (SFT). A minimal usage sketch is provided at the end of this card.

## Model Details

- **Model Type**: Causal Language Model
- **Base Model**: BioMistral-7B
- **Language**: French (adapted from an English medical model)
- **Domain**: Medical/Healthcare
- **Parameters**: 7 billion
- **License**: Apache 2.0
- **Paper**: [Adaptation des connaissances médicales pour les grands modèles de langue : Stratégies et analyse comparative](https://github.com/ikram28/medllm-strategies)

## Training Details

### Continual Pre-Training (CPT)

- **Dataset**: NACHOS corpus (opeN crAwled frenCh Healthcare cOrpuS)
- **Size**: 7.4 GB of French medical texts
- **Word Count**: over 1 billion words
- **Sources**: 24 French medical websites
- **Training Duration**: 2.8 epochs
- **Hardware**: 32 NVIDIA H100 80GB GPUs
- **Training Time**: 11 hours
- **Optimizer**: AdamW
- **Learning Rate**: 2e-5
- **Weight Decay**: 0.01
- **Batch Size**: 16, with gradient accumulation of 2 (a configuration sketch appears at the end of this card)

### Supervised Fine-Tuning (SFT)

- **Dataset**: 30K French medical question-answer pairs
  - 10K native French medical questions
  - 10K medical questions translated from English resources
  - 10K questions generated from French medical texts
- **Method**: DoRA (Weight-Decomposed Low-Rank Adaptation); a configuration sketch appears at the end of this card
- **Training Duration**: 10 epochs
- **Hardware**: 1 NVIDIA H100 80GB GPU
- **Training Time**: 42 hours
- **Rank**: 16
- **Alpha**: 16
- **Learning Rate**: 2e-5
- **Batch Size**: 4

## Computational Impact

- **Total Training Time**: 53 hours (11 h CPT + 42 h SFT)
- **Hardware**: 32 NVIDIA H100 GPUs (CPT) + 1 NVIDIA H100 GPU (SFT)
- **Carbon Emissions**: 10.11 kgCO2e (9.04 kgCO2e for CPT + 1.07 kgCO2e for SFT)

## Ethical Considerations

- **Medical Accuracy**: This model is intended for research and educational purposes only; its performance limitations make it unsuitable for critical medical applications
- **Bias**: The model may reflect biases present in both the English and French medical literature used in training

## Citation

If you use this model, please cite:

```bibtex
```

## Contact

For questions about this model, please contact: ikram.belmadani@lis-lab.fr
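## Usage (sketch)

A minimal inference sketch using the `transformers` library. The Hub repository id `ik-ram28/BioMistral-CPT-SFT-7B` is assumed from the model name and the `base_model` metadata; adjust it if the actual id differs.

```python
# Minimal inference sketch. The repository id below is an assumption
# derived from the model name; adjust it if the actual Hub id differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ik-ram28/BioMistral-CPT-SFT-7B"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision fits a 7B model on one modern GPU
    device_map="auto",
)

prompt = "Quels sont les symptômes de l'hypertension artérielle ?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```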
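The CPT hyperparameters listed above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch under stated assumptions, not the authors' training script; every setting not listed on this card (output directory, precision, scheduler, and so on) is an assumption.

```python
# Hedged sketch of the CPT optimizer settings from this card:
# AdamW, lr 2e-5, weight decay 0.01, batch size 16, gradient accumulation 2.
# output_dir, bf16, and all other unlisted values are assumptions.
from transformers import TrainingArguments

cpt_args = TrainingArguments(
    output_dir="biomistral-cpt-7b",   # assumed
    optim="adamw_torch",              # AdamW (from the card)
    learning_rate=2e-5,               # from the card
    weight_decay=0.01,                # from the card
    per_device_train_batch_size=16,   # batch size 16 (from the card)
    gradient_accumulation_steps=2,    # from the card
    bf16=True,                        # assumed, typical for H100 training
)
```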
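Similarly, the SFT section's DoRA settings (rank 16, alpha 16) can be expressed with the `peft` library, which supports DoRA through `LoraConfig(use_dora=True, ...)` in version 0.10.0 and later. `target_modules` and all other unlisted settings are assumptions, not the authors' actual configuration.

```python
# Hedged sketch of the DoRA setup from the SFT section (rank 16, alpha 16).
# target_modules is an assumption; the card does not list which modules
# were adapted. Requires peft >= 0.10.0 for use_dora.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("ik-ram28/BioMistral-CPT-7B")

dora_config = LoraConfig(
    r=16,                 # Rank: 16 (from the card)
    lora_alpha=16,        # Alpha: 16 (from the card)
    use_dora=True,        # Weight-Decomposed Low-Rank Adaptation
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
)

model = get_peft_model(base, dora_config)
model.print_trainable_parameters()
```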