Create README.md
README.md
ADDED
@@ -0,0 +1,91 @@
---
license: apache-2.0
datasets:
- SungJoo/KBMC
language:
- ko
library_name: transformers
tags:
- medical
- NER
---

# Model Card for medical-ner-koelectra

## Model Summary

This model is a fine-tuned version of [monologg/koelectra-base-v3-discriminator](https://huggingface.co/monologg/koelectra-base-v3-discriminator) for Korean medical named entity recognition (NER).

We fine-tuned the model on KBMC and the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html).

## Model Details

### Model Description

- **Developed by:** Sungjoo Byun (Grace Byun)
- **Language(s) (NLP):** Korean
- **License:** Apache 2.0
- **Finetuned from model:** monologg/koelectra-base-v3-discriminator

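The card does not include a usage snippet, so the following is a minimal sketch of how a model like this is typically loaded with the `transformers` token-classification pipeline. The repository id `SungJoo/medical-ner-koelectra` is an assumption inferred from the card title and the dataset owner, and `load_ner` and `entities_by_label` are hypothetical helper names introduced only for illustration. The label names in the output are assumed to match the classes listed in the performance table.

```python
from typing import Dict, List

# ASSUMPTION: the repository id is not stated in this card; replace it with the real one.
MODEL_ID = "SungJoo/medical-ner-koelectra"


def load_ner(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def entities_by_label(predictions: List[Dict]) -> Dict[str, List[str]]:
    """Group aggregated pipeline predictions as {label: [surface forms]}."""
    grouped: Dict[str, List[str]] = {}
    for pred in predictions:
        grouped.setdefault(pred["entity_group"], []).append(pred["word"])
    return grouped
```

With the real repository id in place, `entities_by_label(load_ner()("..."))` would return the recognized spans of a Korean sentence grouped by entity class.
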
## Training Data

The model was trained on the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html) and the [Korean Bio-Medical Corpus (KBMC)](https://huggingface.co/datasets/SungJoo/KBMC).

# Model Performance

## Overall Metrics

- **F1 Score:** 0.8886
- **Loss:** 0.2949
- **Precision:** 0.8844
- **Recall:** 0.8928

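As a quick consistency check, the reported F1 score can be recomputed as the harmonic mean of the reported precision and recall. This is a minimal sketch; `f1_score` is a local helper written for this example, not a function from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported in the list above.
overall_f1 = f1_score(0.8844, 0.8928)
print(round(overall_f1, 4))
```

Rounded to four decimals, the result agrees with the reported F1 score of 0.8886.
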
## Class-wise Performance

| Class     | Precision | Recall | F1-Score | Support |
|-----------|-----------|--------|----------|---------|
| AFW       | 0.6676    | 0.6326 | 0.6496   | 362     |
| ANM       | 0.7476    | 0.7800 | 0.7635   | 600     |
| Body      | 0.9731    | 0.9813 | 0.9772   | 1068    |
| CVL       | 0.8492    | 0.8579 | 0.8536   | 4977    |
| DAT       | 0.9078    | 0.9286 | 0.9181   | 2130    |
| Disease   | 0.9738    | 0.9872 | 0.9805   | 2109    |
| EVT       | 0.7332    | 0.7446 | 0.7389   | 1026    |
| FLD       | 0.6138    | 0.6170 | 0.6154   | 188     |
| LOC       | 0.8721    | 0.8691 | 0.8706   | 1734    |
| MAT       | 0.5385    | 0.5000 | 0.5185   | 14      |
| NUM       | 0.9227    | 0.9305 | 0.9266   | 4660    |
| ORG       | 0.8917    | 0.8866 | 0.8892   | 3307    |
| PER       | 0.8918    | 0.9049 | 0.8983   | 3626    |
| PLT       | 0.2941    | 0.2174 | 0.2500   | 23      |
| TIM       | 0.8644    | 0.9173 | 0.8901   | 278     |
| Treatment | 0.9468    | 0.9852 | 0.9656   | 271     |

## Averages

| Metric    | Micro Avg | Macro Avg | Weighted Avg |
|-----------|-----------|-----------|--------------|
| Precision | 0.8844    | 0.7930    | 0.8841       |
| Recall    | 0.8928    | 0.7963    | 0.8928       |
| F1-Score  | 0.8886    | 0.7941    | 0.8884       |

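The macro and weighted averages can be approximately reproduced from the class-wise table. The sketch below assumes the usual definitions: the macro average is the unweighted mean over classes, and the weighted average weights each class by its support. The micro average is left out because it needs raw true-positive and false-positive counts, which the card does not report. Since the inputs are already rounded to four decimals, the results may differ from the table in the last digit.

```python
from typing import Dict, Tuple

# (precision, recall, f1, support) per class, copied from the class-wise table.
CLASS_METRICS: Dict[str, Tuple[float, float, float, int]] = {
    "AFW": (0.6676, 0.6326, 0.6496, 362),
    "ANM": (0.7476, 0.7800, 0.7635, 600),
    "Body": (0.9731, 0.9813, 0.9772, 1068),
    "CVL": (0.8492, 0.8579, 0.8536, 4977),
    "DAT": (0.9078, 0.9286, 0.9181, 2130),
    "Disease": (0.9738, 0.9872, 0.9805, 2109),
    "EVT": (0.7332, 0.7446, 0.7389, 1026),
    "FLD": (0.6138, 0.6170, 0.6154, 188),
    "LOC": (0.8721, 0.8691, 0.8706, 1734),
    "MAT": (0.5385, 0.5000, 0.5185, 14),
    "NUM": (0.9227, 0.9305, 0.9266, 4660),
    "ORG": (0.8917, 0.8866, 0.8892, 3307),
    "PER": (0.8918, 0.9049, 0.8983, 3626),
    "PLT": (0.2941, 0.2174, 0.2500, 23),
    "TIM": (0.8644, 0.9173, 0.8901, 278),
    "Treatment": (0.9468, 0.9852, 0.9656, 271),
}


def macro_average(column: int) -> float:
    """Unweighted mean of one metric column across all classes."""
    values = [row[column] for row in CLASS_METRICS.values()]
    return sum(values) / len(values)


def weighted_average(column: int) -> float:
    """Mean of one metric column, weighted by each class's support."""
    total_support = sum(row[3] for row in CLASS_METRICS.values())
    weighted_sum = sum(row[column] * row[3] for row in CLASS_METRICS.values())
    return weighted_sum / total_support


for name, column in (("Precision", 0), ("Recall", 1), ("F1-Score", 2)):
    print(f"{name}: macro={macro_average(column):.4f} "
          f"weighted={weighted_average(column):.4f}")
```

The gap between the macro and weighted averages comes mostly from low-support classes such as PLT (support 23) and MAT (support 14), which count as much as any other class in the macro average but contribute little to the weighted one.
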
## Citations

Please cite our KBMC paper:

```bibtex
@misc{byun2024korean,
  title={Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition},
  author={Sungjoo Byun and Jiseung Hong and Sumin Park and Dongjun Jang and Jean Seo and Minseok Kim and Chaeyoung Oh and Hyopil Shin},
  year={2024},
  eprint={2403.16158},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

## Model Card Contact

For any questions or issues, please contact [email protected].