SungJoo/medical-ner-koelectra

Feature Extraction


SungJoo committed on Jun 10, 2024

Commit a052738 · verified · 1 Parent(s): b3fa244

Create README.md

Files changed (1)

README.md +91 -0

README.md ADDED

	@@ -0,0 +1,91 @@

+---
+license: apache-2.0
+datasets:
+- SungJoo/KBMC
+language:
+- ko
+library_name: transformers
+tags:
+- medical
+- NER
+---
+# Model Card for medical-ner-koelectra
+## Model Summary
+This model is a fine-tuned version of [monologg/koelectra-base-v3-discriminator](https://huggingface.co/monologg/koelectra-base-v3-discriminator) for Korean medical named entity recognition (NER).
+It was fine-tuned on the [Korean Bio-Medical Corpus (KBMC)](https://huggingface.co/datasets/SungJoo/KBMC) and the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html).
+## Model Details
+### Model Description
+- **Developed by:** Sungjoo Byun (Grace Byun)
+- **Language(s) (NLP):** Korean
+- **License:** Apache 2.0
+- **Finetuned from model:** monologg/koelectra-base-v3-discriminator
+## Training Data
+The model was trained on two datasets: the [Naver X Changwon Univ NER dataset](https://ko-nlp.github.io/Korpora/ko-docs/corpuslist/naver_changwon_ner.html) and the [Korean Bio-Medical Corpus (KBMC)](https://huggingface.co/datasets/SungJoo/KBMC).
+## Model Performance
+## Overall Metrics
+- **F1 Score:** 0.8886
+- **Loss:** 0.2949
+- **Precision:** 0.8844
+- **Recall:** 0.8928
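Note: the overall precision, recall, and F1 values listed above match the Micro Avg column of the averages table in this README, and the F1 score is consistent with the harmonic mean of the reported precision and recall. A minimal sketch of that check follows; the helper name `f1_score` is chosen here for illustration and is not taken from the model's training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported in the model card.
overall_f1 = f1_score(0.8844, 0.8928)
print(round(overall_f1, 4))  # -> 0.8886
```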
+## Class-wise Performance
+| Class       | Precision | Recall | F1-Score | Support |
+|-------------|-----------|--------|----------|---------|
+| AFW         | 0.6676    | 0.6326 | 0.6496   | 362     |
+| ANM         | 0.7476    | 0.7800 | 0.7635   | 600     |
+| Body        | 0.9731    | 0.9813 | 0.9772   | 1068    |
+| CVL         | 0.8492    | 0.8579 | 0.8536   | 4977    |
+| DAT         | 0.9078    | 0.9286 | 0.9181   | 2130    |
+| Disease     | 0.9738    | 0.9872 | 0.9805   | 2109    |
+| EVT         | 0.7332    | 0.7446 | 0.7389   | 1026    |
+| FLD         | 0.6138    | 0.6170 | 0.6154   | 188     |
+| LOC         | 0.8721    | 0.8691 | 0.8706   | 1734    |
+| MAT         | 0.5385    | 0.5000 | 0.5185   | 14      |
+| NUM         | 0.9227    | 0.9305 | 0.9266   | 4660    |
+| ORG         | 0.8917    | 0.8866 | 0.8892   | 3307    |
+| PER         | 0.8918    | 0.9049 | 0.8983   | 3626    |
+| PLT         | 0.2941    | 0.2174 | 0.2500   | 23      |
+| TIM         | 0.8644    | 0.9173 | 0.8901   | 278     |
+| Treatment   | 0.9468    | 0.9852 | 0.9656   | 271     |
+## Averages
+| Metric         | Micro Avg | Macro Avg | Weighted Avg |
+|----------------|-----------|-----------|--------------|
+| Precision      | 0.8844    | 0.7930    | 0.8841       |
+| Recall         | 0.8928    | 0.7963    | 0.8928       |
+| F1-Score       | 0.8886    | 0.7941    | 0.8884       |
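Note: the macro and weighted F1 averages can be reproduced from the class-wise table. The macro average is the unweighted mean of the per-class F1 scores, while the weighted average weights each class by its support. The sketch below copies the F1-Score and Support columns from that table; the variable and function names are illustrative assumptions, and because the inputs are already rounded to four decimals, the results should only be expected to agree to roughly that precision.

```python
# (class, F1-score, support) rows copied from the class-wise performance table.
CLASS_ROWS = [
    ("AFW", 0.6496, 362), ("ANM", 0.7635, 600), ("Body", 0.9772, 1068),
    ("CVL", 0.8536, 4977), ("DAT", 0.9181, 2130), ("Disease", 0.9805, 2109),
    ("EVT", 0.7389, 1026), ("FLD", 0.6154, 188), ("LOC", 0.8706, 1734),
    ("MAT", 0.5185, 14), ("NUM", 0.9266, 4660), ("ORG", 0.8892, 3307),
    ("PER", 0.8983, 3626), ("PLT", 0.2500, 23), ("TIM", 0.8901, 278),
    ("Treatment", 0.9656, 271),
]


def macro_average(rows):
    """Unweighted mean of the per-class scores: every class counts equally."""
    return sum(score for _, score, _ in rows) / len(rows)


def weighted_average(rows):
    """Mean of the per-class scores weighted by each class's support."""
    total_support = sum(support for _, _, support in rows)
    return sum(score * support for _, score, support in rows) / total_support


macro_f1 = macro_average(CLASS_ROWS)
weighted_f1 = weighted_average(CLASS_ROWS)
print(round(macro_f1, 4))     # -> 0.7941
print(round(weighted_f1, 4))  # -> 0.8884
```

The gap between the macro average (0.7941) and the weighted average (0.8884) comes from low-support classes such as PLT (support 23) and MAT (support 14): they score poorly but count as much as any other class in the macro mean.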
+## Citations
+Please cite our KBMC paper:
+```bibtex
+@misc{byun2024korean,
+      title={Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition},
+      author={Sungjoo Byun and Jiseung Hong and Sumin Park and Dongjun Jang and Jean Seo and Minseok Kim and Chaeyoung Oh and Hyopil Shin},
+      year={2024},
+      eprint={2403.16158},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+## Model Card Contact
+For any questions or issues, please contact [email protected].