NER-LLaMA-3.1-8B-Biomedical
Model Description
This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct for biomedical Named Entity Recognition (NER). It was developed as part of the EvalLLM 2025 challenge (Run 3) and is specifically designed to identify 21 types of biomedical entities in French text.
The model was fine-tuned with LoRA on synthetic data.
Model Details
Base Model
- Architecture: LLaMA-3.1-8B-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Language: French
- Domain: Biomedical and health-related text
- Task: NER
Training Configuration
- LoRA Rank: 16
- Training Epochs: 5
- Batch Size: 4 (with gradient accumulation over 8 steps)
- Learning Rate: 2e-05
- Scheduler: Cosine annealing
- Training Data: 1,748 synthetic documents generated by GPT-4.1
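The configuration above implies an effective batch size and a learning-rate curve that are easy to work out by hand. The sketch below does that arithmetic in plain Python. It assumes a single training device, that all 1,748 documents are used for training, no warmup, and a cosine decay to zero; none of these are stated in this card, and `cosine_lr` is an illustrative helper, not the training code used for the model.

```python
import math

# Values taken from the training configuration above.
NUM_DOCS = 1748
PER_STEP_BATCH = 4
GRAD_ACCUM_STEPS = 8
EPOCHS = 5
PEAK_LR = 2e-5

# Assumptions (not stated in the card): one device, every document used
# for training, no warmup, learning rate decays all the way to zero.
effective_batch = PER_STEP_BATCH * GRAD_ACCUM_STEPS      # examples per optimizer step
steps_per_epoch = math.ceil(NUM_DOCS / effective_batch)  # last partial batch counts as a step
total_steps = steps_per_epoch * EPOCHS


def cosine_lr(step, total=total_steps, peak=PEAK_LR):
    """Cosine-annealed learning rate at a given optimizer step."""
    progress = min(step, total) / total
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", effective_batch)
print("optimizer steps per epoch:", steps_per_epoch)
print("total optimizer steps:", total_steps)
for step in (0, total_steps // 2, total_steps):
    print(f"step {step}: lr = {cosine_lr(step):.2e}")
```

Under these assumptions each optimizer update sees 32 examples, so the whole run amounts to a few hundred updates, which is typical for LoRA fine-tuning on a small synthetic corpus.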
Entity Types (21 categories)
Entity Type | Description | Example |
---|---|---|
ABS_DATE | Absolute dates | "15 mars 2020" |
ABS_PERIOD | Absolute periods | "janvier 2019 à mars 2020" |
BIO_TOXIN | Biological toxins | "toxine botulique" |
DIS_REF_TO_PATH | Disease references to pathogens | "infection par E. coli" |
DOC_AUTHOR | Document authors | "Dr. Martin Dubois" |
DOC_DATE | Document dates | "publié le 12/03/2021" |
DOC_SOURCE | Document sources | "Journal of Medicine" |
EVENT_MACRO | Large-scale events | "épidémie de COVID-19" |
EVENT_MICRO | Small-scale events | "cas de contamination" |
EXPLOSIVE | Explosive materials | "TNT", "dynamite" |
FUZZY_PERIOD | Fuzzy time periods | "début d'année", "récemment" |
INF_DISEASE | Infectious diseases | "grippe", "tuberculose" |
LOCATION | Geographic locations | "Paris", "France" |
LOC_REF_TO_ORG | Location references to organizations | "hôpital de Lyon" |
NON_INF_DISEASE | Non-infectious diseases | "diabète", "cancer" |
ORGANIZATION | Organizations | "OMS", "Institut Pasteur" |
ORG_REF_TO_LOC | Organization references to locations | "OMS Europe" |
PATHOGEN | Pathogens | "virus Ebola", "E. coli" |
PATH_REF_TO_DIS | Pathogen references to diseases | "virus causant la grippe" |
RADIOISOTOPE | Radioactive isotopes | "uranium 235", "césium 137" |
REL_DATE | Relative dates | "hier", "la semaine dernière" |
REL_PERIOD | Relative periods | "depuis 3 mois" |
TOXIC_AGENT | Toxic agents | "plomb", "mercure" |
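Generative NER models can occasionally emit labels outside the schema or spans that do not occur in the input, so it can help to validate predictions against the label set above. The sketch below is one way to do that. The prediction format (a list of dicts with `text` and `label` keys) and the helper `filter_entities` are hypothetical, since this card does not document the model's output format.

```python
# Labels copied from the entity type table above.
ENTITY_TYPES = {
    "ABS_DATE", "ABS_PERIOD", "BIO_TOXIN", "DIS_REF_TO_PATH", "DOC_AUTHOR",
    "DOC_DATE", "DOC_SOURCE", "EVENT_MACRO", "EVENT_MICRO", "EXPLOSIVE",
    "FUZZY_PERIOD", "INF_DISEASE", "LOCATION", "LOC_REF_TO_ORG",
    "NON_INF_DISEASE", "ORGANIZATION", "ORG_REF_TO_LOC", "PATHOGEN",
    "PATH_REF_TO_DIS", "RADIOISOTOPE", "REL_DATE", "REL_PERIOD", "TOXIC_AGENT",
}


def filter_entities(text, predictions):
    """Keep predictions whose label is known and whose span occurs in the text.

    The input format (dicts with "text" and "label") is an assumption made
    for this example, not the model's documented output.
    """
    kept = []
    for pred in predictions:
        span = pred.get("text", "")
        label = pred.get("label", "")
        if label in ENTITY_TYPES and span and span in text:
            # Record the first occurrence of the span as its character offset.
            kept.append({"text": span, "label": label, "start": text.find(span)})
    return kept


doc = "Un cas de tuberculose a été signalé à Paris."
raw = [
    {"text": "tuberculose", "label": "INF_DISEASE"},
    {"text": "Paris", "label": "LOCATION"},
    {"text": "Paris", "label": "CITY"},     # unknown label, dropped
    {"text": "Lyon", "label": "LOCATION"},  # span not in the text, dropped
]
kept = filter_entities(doc, raw)
for ent in kept:
    print(ent)
```

Using the first occurrence as the offset is a simplification; a span that appears several times in a document would need a more careful alignment step.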
Performance with Few-Shot Learning
- Micro F1 Score: 60.67%
- Micro Precision: 58.14%
- Micro Recall: 63.44%
- Macro F1 Score: 40.91%
- Macro Precision: 40.99%
- Macro Recall: 42.56%
Note: Detailed performance metrics are available in the evaluation results.
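The gap between the micro and macro scores is easier to read with the definitions in hand: micro averaging pools true positives, false positives, and false negatives over all entity types, while macro averaging computes F1 per type and takes the unweighted mean. The sketch below illustrates both on hypothetical counts (the `toy` dictionary is invented for illustration and is not evaluation data from this model), and then checks that the reported micro F1 equals the harmonic mean of the reported micro precision and recall.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_macro_f1(counts):
    """counts maps an entity type to a (tp, fp, fn) tuple."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = prf(tp, fp, fn)[2]                                       # pooled counts
    macro = sum(prf(*c)[2] for c in counts.values()) / len(counts)   # mean of per-type F1
    return micro, macro


# Hypothetical counts: one frequent, easy type and one rare, hard type.
toy = {"LOCATION": (90, 10, 10), "EXPLOSIVE": (1, 3, 4)}
micro, macro = micro_macro_f1(toy)
print(f"toy micro F1 = {micro:.4f}, toy macro F1 = {macro:.4f}")

# Consistency check on the figures reported above (in percent).
reported_p, reported_r = 58.14, 63.44
reported_micro_f1 = 2 * reported_p * reported_r / (reported_p + reported_r)
print(f"harmonic mean of reported micro P and R = {reported_micro_f1:.2f}")
```

In the toy case the rare type drags the macro score well below the micro score. A similar effect, with low-frequency entity types scoring poorly, is one plausible reading of the roughly 20-point gap reported here, though the per-type results would be needed to confirm it.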
Citation
Related Resources
- GitHub Repository: EvalLLM2025
- Training Dataset: Synthetic Biomedical NER
- Base Model: LLaMA-3.1-8B-Instruct
- Paper: [Link to paper when published]
Contact
For questions or issues:
- GitHub Issues: EvalLLM2025 Issues