krishnareddy committed on
Commit
c0ddd3e
·
verified ·
1 Parent(s): b4775a7

Update README.md

Files changed (1)
  1. README.md +35 -7
README.md CHANGED
@@ -1,5 +1,5 @@
1
  ---
2
- license: mit
3
  language:
4
  - en
5
  tags:
@@ -45,12 +45,40 @@ widget:
45
  - text: 'Impression: fever, chills, cough, chest pain, shortness of breath, N/V.'
46
  example_title: Example 2
47
  ---
48
- # DX Coding Training Model
49
 
50
- ## Introduction
51
- This model is focused on automatically generating DX (Diagnostic) codes using a Named Entity Recognition (NER) approach. It's trained on private data at the token level to ensure precision in identifying relevant medical entities.
 
52
 
53
  ## Model Details
54
- - **Approach**: Named Entity Recognition (NER)
55
- - **Training Data**: Private dataset, token-level annotations
56
- - **Primary Use-case**: Generating DX codes from medical texts
 
1
  ---
2
+ license: apache-2.0
3
  language:
4
  - en
5
  tags:
 
45
  - text: 'Impression: fever, chills, cough, chest pain, shortness of breath, N/V.'
46
  example_title: Example 2
47
  ---
48
+ # ICD-10 DX Code Identification Model
49
 
50
+ ## Overview
51
+ This model identifies tokens related to ICD-10 DX codes in clinical documents. We focus on a subset of more than 4,000 codes,
52
+ which are the most frequently used in clinical documentation. Please refer to the config.json file for the target codes used to train this model.
53
 
54
  ## Model Details
55
+ - **Type**: Named Entity Recognition (NER)
56
+ - **Target**: ICD-10 DX Codes
57
+ - **Code Subset**: 4,000+ most common codes
58
+
59
+ ## Dataset
60
+ The dataset comprises clinical documents annotated for ICD-10 DX codes. We ensure a balanced representation of the selected codes to prevent model bias.
61
+ The dataset is private and is used internally to train the model.
62
+
63
+ ## Training
64
+ Due to GPU memory constraints, training is conducted in epochs with periodic evaluations to monitor performance and mitigate overfitting.
65
+
66
+ # Use a pipeline as a high-level helper
67
+
68
+ from transformers import pipeline
69
+
70
+ pipe = pipeline("token-classification", model="imperiumhf/imp_clinical_dxcode_ner_v2")
71
+
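The pipeline call above returns a list of per-token predictions. As a minimal sketch of how those predictions might be collapsed into a set of candidate codes, the hypothetical helper below (`summarize_codes`, not part of this repository) keeps the highest score seen for each label. It assumes the standard `transformers` token-classification output shape (dicts with `entity` or `entity_group` and `score` keys) and that labels may carry a `B-`/`I-` prefix; the sample labels and scores are illustrative only, and the actual label set is the one listed in config.json.

```python
from typing import Dict, List


def summarize_codes(predictions: List[dict], min_score: float = 0.5) -> Dict[str, float]:
    """Map each predicted label to its highest score at or above min_score."""
    best: Dict[str, float] = {}
    for pred in predictions:
        # "entity_group" appears when an aggregation strategy is set; otherwise "entity".
        label = pred.get("entity_group") or pred.get("entity")
        if label is None:
            continue
        # Strip a BIO prefix if present (assumption: the model may or may not use one).
        if label[:2] in ("B-", "I-"):
            label = label[2:]
        score = float(pred["score"])
        if score >= min_score and score > best.get(label, 0.0):
            best[label] = score
    return best


# Hand-written sample in the shape the pipeline returns; not real model output.
sample = [
    {"entity": "B-R50.9", "score": 0.91, "word": "fever", "start": 12, "end": 17},
    {"entity": "I-R50.9", "score": 0.62, "word": "fever", "start": 12, "end": 17},
    {"entity": "R05", "score": 0.30, "word": "cough", "start": 27, "end": 32},
]
codes = summarize_codes(sample)
```

With a loaded pipeline, the same helper could be applied as `summarize_codes(pipe(text))`; the 0.5 threshold is an arbitrary starting point, not a value recommended by the model authors.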
72
+ ## Evaluation
73
+ Evaluation metrics have not been added yet.
74
+
75
+ ## Limitations and Considerations
76
+ - Overfitting risk due to repeated training on the same dataset.
77
+ - Model complexity must be balanced against the large number of classes (4,000+ codes).
78
+ - Regular evaluation is required to monitor performance.
79
+
80
+ ## Contact
81
82
+
83
+ ## Acknowledgements
84
+ All rights to this model are reserved by Imperium software solutions pvt ltd.