SungJoo committed on
Commit
a052738
·
verified ·
1 Parent(s): b3fa244

Create README.md

Files changed (1)
  1. README.md +91 -0
README.md ADDED
@@ -0,0 +1,91 @@
---
license: apache-2.0
datasets:
- SungJoo/KBMC
language:
- ko
library_name: transformers
tags:
- medical
- NER
---

# Model Card for medical-ner-koelectra

## Model Summary

This model is a fine-tuned version of [monologg/koelectra-base-v3-discriminator](https://huggingface.co/monologg/koelectra-base-v3-discriminator).

We fine-tuned the model on the Korean Bio-Medical Corpus (KBMC) and the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html).
21
+ ## Model Details
22
+
23
+ ### Model Description
24
+
25
+ - **Developed by:** Sungjoo Byun (Grace Byun)
26
+ - **Language(s) (NLP):** Korean
27
+ - **License:** Apache 2.0
28
+ - **Finetuned from model:** monologg/koelectra-base-v3-discriminator
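
Since the card lists `transformers` as the library, loading the model for inference might look like the sketch below. The repository ID `SungJoo/medical-ner-koelectra` is an assumption inferred from the card title and the author's namespace, and `load_ner` and `group_by_label` are hypothetical helper names. The label strings returned depend on the model's own configuration, which this card does not spell out.

```python
from collections import defaultdict

# Assumption: repo ID inferred from the card title and the author's namespace.
# Replace it with the actual Hub ID if it differs.
MODEL_ID = "SungJoo/medical-ner-koelectra"


def load_ner(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word tokens into entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def group_by_label(entities):
    """Collect predicted entity strings under their predicted label."""
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)
```

Calling `group_by_label(load_ner()(text))` on a Korean sentence would then return a dictionary mapping each predicted label to the entity strings found in `text`.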


## Training Data

The model was trained on the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html) and the [Korean Bio-Medical Corpus (KBMC)](https://huggingface.co/datasets/SungJoo/KBMC).

# Model Performance

## Overall Metrics

- **F1 Score:** 0.8886
- **Loss:** 0.2949
- **Precision:** 0.8844
- **Recall:** 0.8928
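
The reported F1 score is consistent with the harmonic mean of the precision and recall listed above. The following is a minimal check, assuming the standard F1 definition; `f1_score` is a hypothetical helper name, not part of any library used by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the overall metrics.
overall_f1 = f1_score(0.8844, 0.8928)
print(f"{overall_f1:.4f}")  # prints 0.8886
```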

## Class-wise Performance

| Class     | Precision | Recall | F1-Score | Support |
|-----------|-----------|--------|----------|---------|
| AFW       | 0.6676    | 0.6326 | 0.6496   | 362     |
| ANM       | 0.7476    | 0.7800 | 0.7635   | 600     |
| Body      | 0.9731    | 0.9813 | 0.9772   | 1068    |
| CVL       | 0.8492    | 0.8579 | 0.8536   | 4977    |
| DAT       | 0.9078    | 0.9286 | 0.9181   | 2130    |
| Disease   | 0.9738    | 0.9872 | 0.9805   | 2109    |
| EVT       | 0.7332    | 0.7446 | 0.7389   | 1026    |
| FLD       | 0.6138    | 0.6170 | 0.6154   | 188     |
| LOC       | 0.8721    | 0.8691 | 0.8706   | 1734    |
| MAT       | 0.5385    | 0.5000 | 0.5185   | 14      |
| NUM       | 0.9227    | 0.9305 | 0.9266   | 4660    |
| ORG       | 0.8917    | 0.8866 | 0.8892   | 3307    |
| PER       | 0.8918    | 0.9049 | 0.8983   | 3626    |
| PLT       | 0.2941    | 0.2174 | 0.2500   | 23      |
| TIM       | 0.8644    | 0.9173 | 0.8901   | 278     |
| Treatment | 0.9468    | 0.9852 | 0.9656   | 271     |

## Averages

| Metric    | Micro Avg | Macro Avg | Weighted Avg |
|-----------|-----------|-----------|--------------|
| Precision | 0.8844    | 0.7930    | 0.8841       |
| Recall    | 0.8928    | 0.7963    | 0.8928       |
| F1-Score  | 0.8886    | 0.7941    | 0.8884       |
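
The macro and weighted averages can be recomputed from the class-wise table as a sanity check. The sketch below assumes the usual definitions: the macro average is the unweighted mean over classes and the weighted average is the support-weighted mean. `ROWS`, `macro_avg`, and `weighted_avg` are illustrative names. The micro average needs raw true-positive and false-positive counts, which the card does not report, so it is not recomputed here.

```python
# (class, precision, recall, f1, support) copied from the class-wise table.
ROWS = [
    ("AFW", 0.6676, 0.6326, 0.6496, 362),
    ("ANM", 0.7476, 0.7800, 0.7635, 600),
    ("Body", 0.9731, 0.9813, 0.9772, 1068),
    ("CVL", 0.8492, 0.8579, 0.8536, 4977),
    ("DAT", 0.9078, 0.9286, 0.9181, 2130),
    ("Disease", 0.9738, 0.9872, 0.9805, 2109),
    ("EVT", 0.7332, 0.7446, 0.7389, 1026),
    ("FLD", 0.6138, 0.6170, 0.6154, 188),
    ("LOC", 0.8721, 0.8691, 0.8706, 1734),
    ("MAT", 0.5385, 0.5000, 0.5185, 14),
    ("NUM", 0.9227, 0.9305, 0.9266, 4660),
    ("ORG", 0.8917, 0.8866, 0.8892, 3307),
    ("PER", 0.8918, 0.9049, 0.8983, 3626),
    ("PLT", 0.2941, 0.2174, 0.2500, 23),
    ("TIM", 0.8644, 0.9173, 0.8901, 278),
    ("Treatment", 0.9468, 0.9852, 0.9656, 271),
]
SUPPORT = 4  # tuple index of the support column


def macro_avg(col: int) -> float:
    """Unweighted mean of one metric column over all classes."""
    return sum(row[col] for row in ROWS) / len(ROWS)


def weighted_avg(col: int) -> float:
    """Mean of one metric column, weighted by each class's support."""
    total = sum(row[SUPPORT] for row in ROWS)
    return sum(row[col] * row[SUPPORT] for row in ROWS) / total


for name, col in (("Precision", 1), ("Recall", 2), ("F1-Score", 3)):
    print(f"{name:<10} macro={macro_avg(col):.4f} weighted={weighted_avg(col):.4f}")
```

Because macro averaging weights every class equally, low-support classes such as PLT (23 examples) and MAT (14 examples) pull the macro F1 well below the micro and weighted values.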


## Citations

Please cite our KBMC paper:

```bibtex
@misc{byun2024korean,
      title={Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition},
      author={Sungjoo Byun and Jiseung Hong and Sumin Park and Dongjun Jang and Jean Seo and Minseok Kim and Chaeyoung Oh and Hyopil Shin},
      year={2024},
      eprint={2403.16158},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## Model Card Contact

For any questions or issues, please contact [email protected].