---
license: apache-2.0
language:
- en
tags:
- Token Classification
co2_eq_emissions: 0.0279399890043426
widget:
- text: "MSH|^~&|SendingAPP|MYTEST|||20230621090000||ORU^R01|1|P|2.5.1||||||UNICODE
PID|1||13579246^^^TEST||Taylor^Michael||19830520|M|||987 Pine St^^Anytown^NY^23456||555-456-7890
PV1|1||bc^^004
OBR|1||13579246|BCD^LEFT Breast Cancer Diagnosis^99MRC||20230621090000|||Taylor^Sarah||20230620090000|||N
OBX|1|ST|FINDINGS^Findings^99MRC||Lab report shows asymmetric density in the right breast.|F|||R
OBX|2|ST|IMPRESSION^Impression^99MRC||BIRADS category: 4 - Probably left side as issues.|F|||R
OBX|3|ST|RECOMMENDATION^Recommendation^99MRC||Follow-up specialist visit in six months.|F|||R"
  example_title: "example 1"
---
 

## About the Model
An English named entity recognition (NER) model trained on the MACCROBAT dataset to recognize 107 biomedical entities in text such as clinical case reports. The model is built on top of `distilbert-base-uncased`.

- Dataset: MACCROBAT https://figshare.com/articles/dataset/MACCROBAT2018/9764942
- Carbon emission: 0.0279399890043426 kg
- Training time: 30.16527 minutes
- GPU used: 1 x GeForce RTX 3060 Laptop GPU

Check out the tutorial video for an explanation of this model and the corresponding Python library: https://youtu.be/xpiDPdBpS18

## Usage
The easiest way to try the model is the hosted Inference API on Hugging Face. The second method is the `pipeline` object offered by the `transformers` library:
```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("d4data/biomedical-ner-all")
model = AutoModelForTokenClassification.from_pretrained("d4data/biomedical-ner-all")

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans;
# pass device=0 to run on the first GPU
pipe = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
pipe("""The patient reported no recurrence of palpitations at follow-up 6 months after the ablation.""")
```
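With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start` and `end`. The sketch below shows one possible way to post-process that list by dropping low-confidence spans and grouping the rest by label. The helper `group_entities`, the `min_score` threshold of 0.5 and the hand-written `sample_results` (labels, scores and offsets) are illustrative assumptions, not actual model output.

```python
from collections import defaultdict


def group_entities(results, min_score=0.5):
    """Group aggregated NER results by label, skipping low-confidence spans.

    `results` is expected to have the shape returned by a transformers
    "ner" pipeline with aggregation_strategy="simple".
    `min_score` is a hypothetical cut-off; tune it for your data.
    """
    grouped = defaultdict(list)
    for item in results:
        if item["score"] >= min_score:
            grouped[item["entity_group"]].append(item["word"])
    return dict(grouped)


# Hand-written stand-in for pipeline output (illustrative values only).
sample_results = [
    {"entity_group": "Sign_symptom", "score": 0.98, "word": "palpitations", "start": 38, "end": 50},
    {"entity_group": "Date", "score": 0.31, "word": "6 months", "start": 64, "end": 72},
    {"entity_group": "Therapeutic_procedure", "score": 0.95, "word": "ablation", "start": 83, "end": 91},
]

grouped = group_entities(sample_results)
print(grouped)
```

In practice, `sample_results` would be replaced by the return value of `pipe(...)` from the snippet above. The `start` and `end` offsets index into the input string, so they can also be used to highlight the matched spans in the original text.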

## Author