Model Details

Model Description

This model performs sentence segmentation of MIMIC-III clinical notes. It takes clinical text as input and predicts BIO tags, where B marks the beginning of a sentence, I marks a token inside a sentence, and O marks a token outside any sentence. More details are in the paper Automatic sentence segmentation of clinical record narratives in real-world data. Sample code for using this model is available on GitHub.
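To make the BIO scheme concrete, here is a minimal sketch of how per-token tags could be grouped back into sentences. It is not the authors' code: the function name `bio_to_sentences`, the plain "B"/"I"/"O" label strings, the whitespace joining, and the toy input are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def bio_to_sentences(tagged_tokens: Iterable[Tuple[str, str]]) -> List[str]:
    """Group (token, tag) pairs into sentences using B/I/O tags.

    Hypothetical helper: assumes tags are the literal strings "B", "I", "O".
    """
    sentences: List[List[str]] = []
    current: List[str] = []
    for token, tag in tagged_tokens:
        if tag == "B":
            # B starts a new sentence; close the previous one first.
            if current:
                sentences.append(current)
            current = [token]
        elif tag == "I":
            # I continues the open sentence. If none is open (an I right
            # after O or at the very start), a new sentence is started.
            current.append(token)
        else:
            # O (or any unknown tag) lies outside every sentence.
            if current:
                sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    # Joining with spaces is a simplification; real notes would need
    # character offsets to recover the original spacing.
    return [" ".join(tokens) for tokens in sentences]


# Made-up toy input, not taken from MIMIC-III.
example = [
    ("Pt", "B"), ("denies", "I"), ("chest", "I"), ("pain", "I"),
    ("____", "O"),
    ("Vitals", "B"), ("stable", "I"),
]
print(bio_to_sentences(example))  # ['Pt denies chest pain', 'Vitals stable']
```

The O tag is what lets the model skip material such as separators or de-identification placeholders instead of forcing every token into a sentence.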

Our segmentation model is based on microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext, which we trained on MIMIC-III notes for a sequence labeling (token classification) task.
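Because this is a standard token classification model, it can probably be loaded with the Hugging Face `transformers` pipeline. The sketch below assumes the repository id `dongfangxu/SentenceSegmenter-MIMIC`, assumes the checkpoint works with the generic pipeline, and uses a hypothetical helper name `tag_text`. The label strings it returns depend on the model's configuration, and notes longer than the encoder's input limit would need chunking, which is not shown. The authors' sample code on GitHub remains the reference for actual usage.

```python
def tag_text(text: str, model_id: str = "dongfangxu/SentenceSegmenter-MIMIC"):
    """Return (word piece, predicted label) pairs for a short text.

    Hypothetical helper, not part of the released code.
    """
    # Imported lazily so that defining this function does not require
    # the transformers package to be installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="none",  # keep one prediction per word piece
        ignore_labels=[],             # by default the pipeline drops "O"
    )
    return [(item["word"], item["entity"]) for item in tagger(text)]
```

Calling `tag_text("Pt denies chest pain. Vitals stable.")` downloads the model on first use and returns one pair per word piece.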

Model type: token classification model
Language(s) (NLP): en
Parent Model: microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext
Resources for more information: GitHub Repo

Citation

Dongfang Xu, Davy Weissenbacher, Karen O’Connor, Siddharth Rawal, and Graciela Gonzalez Hernandez. 2024. Automatic sentence segmentation of clinical record narratives in real-world data. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20780–20793, Miami, Florida, USA. Association for Computational Linguistics.
