imperiumhf/imp_clinical_dxcode_ner_v2

Token Classification

krishnareddy committed on Jan 29, 2024

Commit c0ddd3e · verified · 1 Parent(s): b4775a7

Update README.md

Files changed (1)

README.md +35 -7

@@ -1,5 +1,5 @@
 ---
-license: mit
 language:
 - en
 tags:
@@ -45,12 +45,40 @@ widget:
 - text: 'Impression: fever, chills, cough, chest pain, shortness of breath, N/V.'
   example_title: Example 2
 ---
-# DX Coding Training Model
-## Introduction
-This model is focused on automatically generating DX (Diagnostic) codes using a Named Entity Recognition (NER) approach. It's trained on private data at the token level to ensure precision in identifying relevant medical entities.
 ## Model Details
-- **Approach**: Named Entity Recognition (NER)
-- **Training Data**: Private dataset, token-level annotations
-- **Primary Use-case**: Generating DX codes from medical texts

 ---
+license: apache-2.0
 language:
 - en
 tags:
 - text: 'Impression: fever, chills, cough, chest pain, shortness of breath, N/V.'
   example_title: Example 2
 ---
+# ICD-10 DX Code Identification Model
+## Overview
+This model identifies tokens associated with ICD-10 DX codes in clinical documents. We focus on a subset of more than 4,000 codes,
+which are the most frequently used in clinical documentation. Please refer to the config.json file for the target codes used to train this model.
 ## Model Details
+- **Type**: Named Entity Recognition (NER)
+- **Target**: ICD-10 DX Codes
+- **Code Subset**: 4,000+ most common codes
+## Dataset
+The dataset comprises clinical documents annotated for ICD-10 DX codes. We ensure a balanced representation of the selected codes to prevent model bias.
+The dataset is private and is used internally to train the model.
+## Training
+Due to GPU memory constraints, training is conducted in epochs with periodic evaluations to monitor performance and mitigate overfitting.
+# Use a pipeline as a high-level helper
+from transformers import pipeline
+pipe = pipeline("token-classification", model="imperiumhf/imp_clinical_dxcode_ner_v2")
+## Evaluation
+Evaluation metrics still need to be added.
+## Limitations and Considerations
+- Overfitting risk due to repeated training on the same dataset.
+- The balance between model complexity and the large number of classes.
+- Regular model evaluation is needed to monitor performance.
+## Contact
+[email protected]
+## Acknowledgements
+All rights to this model are reserved by Imperium software solutions pvt ltd.
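
The pipeline snippet added to the README stops at constructing `pipe`. As a hedged illustration of how its output might be consumed, here is a minimal sketch. It assumes the label names in `config.json` are exposed through the standard `id2label` mapping and that `aggregation_strategy="simple"` is appropriate for this model; neither is confirmed by the model card. The helper names (`target_labels`, `codes_from_entities`, `tag_text`) and the 0.5 score threshold are hypothetical and not part of the repository.

```python
from collections import defaultdict

MODEL_ID = "imperiumhf/imp_clinical_dxcode_ner_v2"


def target_labels(model_id=MODEL_ID):
    """Return the label names stored in the model's config.json (id2label)."""
    # Imported lazily so the pure helper below works without transformers.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(model_id)
    return sorted(set(config.id2label.values()))


def codes_from_entities(entities, min_score=0.5):
    """Group aggregated pipeline entities by label, dropping low-score spans.

    `entities` is the list of dicts returned by a token-classification
    pipeline created with aggregation_strategy="simple"; each dict has
    the keys entity_group, score, word, start and end.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(
                (ent["word"], ent["start"], ent["end"])
            )
    return dict(grouped)


def tag_text(text, model_id=MODEL_ID, min_score=0.5):
    """Run the model on `text` and return {label: [(word, start, end), ...]}."""
    from transformers import pipeline

    pipe = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # assumption: merge sub-word pieces
    )
    return codes_from_entities(pipe(text), min_score=min_score)
```

Calling `tag_text("Impression: fever, chills, cough, chest pain, shortness of breath, N/V.")` on the widget example would download the model and return the spans grouped by predicted label; the threshold is worth tuning once evaluation metrics are available.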