prajwal967
committed on
Commit
·
0d64fa1
1
Parent(s):
c7998b4
add brackets
README.md
CHANGED
@@ -22,9 +22,9 @@ license: mit
|
|
22 |
|
23 |
# Model Description
|
24 |
|
25 |
-
* A RoBERTa [Liu et al., 2019](https://arxiv.org/pdf/1907.11692.pdf) model fine-tuned for de-identification of medical notes.
|
26 |
* Sequence Labeling (token classification): The model was trained to predict protected health information (PHI/PII) entities (spans). A list of protected health information categories is given by [HIPAA](https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html).
|
27 |
-
* A token can either be classified as non-PHI or as one of the 11 PHI types. Token predictions
|
28 |
* The PHI labels that were used for training and other details can be found here: [Annotation Guidelines](https://github.com/obi-ml-public/ehr_deidentification/blob/master/AnnotationGuidelines.md)
|
29 |
* More details on how to use this model, the format of data and other useful information is present in the GitHub repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
|
30 |
|
@@ -41,7 +41,7 @@ license: mit
|
|
41 |
|
42 |
# Dataset
|
43 |
|
44 |
-
* The I2B2 2014 [Stubbs and Uzuner, 2015](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978170/) dataset was used to train this model.
|
45 |
|
46 |
| | I2B2 | | I2B2 | |
|
47 |
| --------- | --------------------- | ---------- | -------------------- | ---------- |
|
@@ -81,3 +81,7 @@ license: mit
|
|
81 |
|
82 |
|
83 |
## Results
22 |
|
23 |
# Model Description
|
24 |
|
25 |
+
* A RoBERTa [[Liu et al., 2019]](https://arxiv.org/pdf/1907.11692.pdf) model fine-tuned for de-identification of medical notes.
|
26 |
* Sequence Labeling (token classification): The model was trained to predict protected health information (PHI/PII) entities (spans). A list of protected health information categories is given by [HIPAA](https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html).
|
27 |
+
* A token can either be classified as non-PHI or as one of the 11 PHI types. Token predictions are aggregated to spans by making use of BILOU tagging.
|
28 |
* The PHI labels that were used for training and other details can be found here: [Annotation Guidelines](https://github.com/obi-ml-public/ehr_deidentification/blob/master/AnnotationGuidelines.md)
|
29 |
* More details on how to use this model, the format of data and other useful information is present in the GitHub repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
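
The aggregation of token predictions into spans mentioned in the bullet on BILOU tagging can be sketched roughly as follows. This is a minimal illustration and not the repository's actual implementation: the function name `bilou_to_spans`, the `PREFIX-TYPE` tag format, and the example labels are assumptions made for this sketch, and malformed sequences are simply dropped here, which may differ from how the model's own post-processing handles them.

```python
from typing import List, Optional, Tuple


def bilou_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-token BILOU tags into (label, start, end) spans (end exclusive).

    Hypothetical helper: assumes tags look like "B-DATE", "I-DATE", "L-DATE",
    "U-AGE", or "O" for non-PHI tokens.
    """
    spans: List[Tuple[str, int, int]] = []
    start: Optional[int] = None  # index of the first token of the open span
    label: Optional[str] = None  # PHI type of the open span
    for i, tag in enumerate(tags):
        prefix, _, kind = tag.partition("-")
        if prefix == "U":
            # Unit: a single-token span
            spans.append((kind, i, i + 1))
            start, label = None, None
        elif prefix == "B":
            # Begin: open a new multi-token span
            start, label = i, kind
        elif prefix == "I" and label == kind:
            # Inside: the open span continues
            continue
        elif prefix == "L" and label == kind:
            # Last: close the open span
            spans.append((kind, start, i + 1))
            start, label = None, None
        else:
            # "O" or an inconsistent sequence: discard any open span
            start, label = None, None
    return spans


# Illustrative labels only; the real PHI types are listed in the Annotation Guidelines.
example_tags = ["O", "B-PATIENT", "L-PATIENT", "O", "U-AGE", "B-DATE", "I-DATE", "L-DATE"]
example_spans = bilou_to_spans(example_tags)
```

Each returned tuple gives the PHI type and the token range it covers, so a span can be mapped back to text by joining the tokens from `start` up to, but not including, `end`.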
|
30 |
|
|
|
41 |
|
42 |
# Dataset
|
43 |
|
44 |
+
* The I2B2 2014 [[Stubbs and Uzuner, 2015]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978170/) dataset was used to train this model.
|
45 |
|
46 |
| | I2B2 | | I2B2 | |
|
47 |
| --------- | --------------------- | ---------- | -------------------- | ---------- |
|
|
|
81 |
|
82 |
|
83 |
## Results
|
84 |
+
|
85 |
+
# Questions?
|
86 |
+
|
87 |
+
Post a Github issue on the repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
|