NER-LLaMA-3.1-8B-Biomedical

Model Description

This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct for biomedical Named Entity Recognition (NER). It was developed as part of the EvalLLM 2025 challenge (Run 3) and is specifically designed to identify 21 types of biomedical entities in French text.

The model is trained with LoRA fine-tuning on synthetic data.

Model Details

Base Model

  • Architecture: LLaMA-3.1-8B-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Language: French
  • Domain: Biomedical and health-related text
  • Task : NER

Training Configuration

  • LoRA Rank: 16
  • Training Epochs: 5
  • Batch Size: 4 (with gradient accumulation over 8 steps)
  • Learning Rate: 2e-05
  • Scheduler: Cosine annealing
  • Training Data: 1,748 synthetic documents generated by GPT-4.1

Entity Types (21 categories)

Entity Type Description Example
ABS_DATE Absolute dates "15 mars 2020"
ABS_PERIOD Absolute periods "janvier 2019 à mars 2020"
BIO_TOXIN Biological toxins "toxine botulique"
DIS_REF_TO_PATH Disease references to pathogens "infection par E. coli"
DOC_AUTHOR Document authors "Dr. Martin Dubois"
DOC_DATE Document dates "publié le 12/03/2021"
DOC_SOURCE Document sources "Journal of Medicine"
EVENT_MACRO Large-scale events "épidémie de COVID-19"
EVENT_MICRO Small-scale events "cas de contamination"
EXPLOSIVE Explosive materials "TNT", "dynamite"
FUZZY_PERIOD Fuzzy time periods "début d'année", "récemment"
INF_DISEASE Infectious diseases "grippe", "tuberculose"
LOCATION Geographic locations "Paris", "France"
LOC_REF_TO_ORG Location references to organizations "hôpital de Lyon"
NON_INF_DISEASE Non-infectious diseases "diabète", "cancer"
ORGANIZATION Organizations "OMS", "Institut Pasteur"
ORG_REF_TO_LOC Organization references to locations "OMS Europe"
PATHOGEN Pathogens "virus Ebola", "E. coli"
PATH_REF_TO_DIS Pathogen references to diseases "virus causant la grippe"
RADIOISOTOPE Radioactive isotopes "uranium 235", "césium 137"
REL_DATE Relative dates "hier", "la semaine dernière"
REL_PERIOD Relative periods "depuis 3 mois"
TOXIC_AGENT Toxic agents "plomb", "mercure"

Performance with few-shot Learning

  • Micro F1 Score: 60.67%
  • Micro Precision: 58.14%
  • Micro Recall: 63.44%
  • Macro F1 Score: 40.91%
  • Macro Precision: 40.99%
  • Macro Recall: 42.56%

Note: Detailed performance metrics available in the evaluation results

Citation


Related Resources

Contact

For questions or issues:

Downloads last month
12
Safetensors
Model size
8.03B params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ik-ram28/NER-LLama-3.1-8B

Finetuned
(1515)
this model

Dataset used to train ik-ram28/NER-LLama-3.1-8B