---
library_name: transformers
tags:
- medical
license: apache-2.0
language:
- fr
- en
base_model:
- ik-ram28/BioMistral-CPT-7B
- BioMistral/BioMistral-7B
---
## Model Description
BioMistral-CPT-SFT-7B is a 7B-parameter French medical language model built on BioMistral-7B. It is adapted to the French medical domain through a combined approach: Continual Pre-Training (CPT) on French medical text, followed by Supervised Fine-Tuning (SFT) on French medical question-answer pairs.
## Model Details
- **Model Type**: Causal Language Model
- **Base Model**: BioMistral-7B
- **Language**: French (adapted from English medical model)
- **Domain**: Medical/Healthcare
- **Parameters**: 7 billion
- **License**: Apache 2.0
- **Paper**: [Adaptation des connaissances médicales pour les grands modèles de langue : Stratégies et analyse comparative](https://github.com/ikram28/medllm-strategies) (Adapting medical knowledge for large language models: strategies and comparative analysis)
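A minimal inference sketch with the 🤗 Transformers library. The Hub repository id `ik-ram28/BioMistral-CPT-SFT-7B` is inferred from this card's title, and the dtype/device settings are assumptions, not tested defaults:

```python
# Sketch: load the model for causal generation. The repository id and
# bf16/device_map choices are assumptions based on this card, not
# verified against the published checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ik-ram28/BioMistral-CPT-SFT-7B"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision for H100-class GPUs
    device_map="auto",
)

prompt = "Quels sont les principaux symptômes de l'hypertension ?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```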
## Training Details
### Continual Pre-Training (CPT)
- **Dataset**: NACHOS corpus (opeN crAwled frenCh Healthcare cOrpuS)
- **Size**: 7.4 GB of French medical texts
- **Word Count**: Over 1 billion words
- **Sources**: 24 French medical websites
- **Training Duration**: 2.8 epochs
- **Hardware**: 32 NVIDIA H100 80GB GPUs
- **Training Time**: 11 hours
- **Optimizer**: AdamW
- **Learning Rate**: 2e-5
- **Weight Decay**: 0.01
- **Batch Size**: 16 with gradient accumulation of 2
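The CPT hyperparameters above can be expressed, for illustration, as a Hugging Face `TrainingArguments` configuration. Only the optimizer, learning rate, weight decay, batch size, gradient accumulation, and epoch count come from this card; the output path and mixed-precision flag are assumptions:

```python
# Hypothetical sketch of the CPT configuration listed above.
# Values not stated in the card (output_dir, bf16) are assumptions.
from transformers import TrainingArguments

cpt_args = TrainingArguments(
    output_dir="biomistral-cpt-7b",   # assumed output path
    per_device_train_batch_size=16,   # batch size 16
    gradient_accumulation_steps=2,    # gradient accumulation of 2
    learning_rate=2e-5,
    weight_decay=0.01,
    optim="adamw_torch",              # AdamW optimizer
    num_train_epochs=2.8,             # ~2.8 epochs over NACHOS
    bf16=True,                        # assumed precision on H100 GPUs
)
```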
### Supervised Fine-Tuning (SFT)
- **Dataset**: 30K French medical question-answer pairs
- 10K native French medical questions
- 10K translated medical questions from English resources
- 10K generated questions from French medical texts
- **Method**: DoRA (Weight-Decomposed Low-Rank Adaptation)
- **Training Duration**: 10 epochs
- **Hardware**: 1 NVIDIA H100 80GB GPU
- **Training Time**: 42 hours
- **Rank**: 16
- **Alpha**: 16
- **Learning Rate**: 2e-5
- **Batch Size**: 4
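A sketch of the DoRA adapter configuration described above, using the PEFT library (`LoraConfig` with `use_dora=True`). Only the rank and alpha come from this card; `target_modules` is an assumption based on the usual Mistral attention projections:

```python
# Hypothetical DoRA setup via PEFT; r and lora_alpha match the card,
# target_modules is an assumed choice for a Mistral-family model.
from peft import LoraConfig

dora_config = LoraConfig(
    r=16,                    # rank 16
    lora_alpha=16,           # alpha 16
    use_dora=True,           # DoRA: weight-decomposed low-rank adaptation
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```

The resulting config would typically be applied with `peft.get_peft_model(model, dora_config)` before fine-tuning.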
## Computational Impact
- **Total Training Time**: 53 hours (11h CPT + 42h SFT)
- **Hardware**: 32 NVIDIA H100 GPUs (CPT) + 1 NVIDIA H100 GPU (SFT)
- **Carbon Emissions**: 10.11 kgCO2e (9.04 CPT + 1.07 SFT)
## Ethical Considerations
- **Medical Accuracy**: This model is intended for research and educational purposes only; its performance limitations make it unsuitable for clinical or other critical medical applications.
- **Bias**: The model may reproduce biases present in both the English and French medical literature used for training.
## Citation
If you use this model, please cite:
```bibtex
```
## Contact
For questions about this model, please contact: [email protected]