Model Details

Model Description

This model performs sentence segmentation of MIMIC-III clinical notes. It takes clinical text as input and predicts BIO tags, where B marks the beginning of a sentence, I marks a token inside a sentence, and O marks a token outside any sentence. More details are in the paper Automatic sentence segmentation of clinical record narratives in real-world data. Sample code for using this model is on GitHub.
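To make the BIO scheme concrete, here is a minimal sketch of turning per-token tags into sentences. The helper name bio_to_sentences, the toy tokens, and the lenient handling of an I tag without a preceding B are illustrative assumptions, not the authors' decoding code.

```python
from typing import List


def bio_to_sentences(tokens: List[str], tags: List[str]) -> List[str]:
    """Group tokens into sentences using per-token BIO tags (hypothetical helper)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    sentences: List[List[str]] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A B tag closes any open sentence and starts a new one.
            if current:
                sentences.append(current)
            current = [token]
        elif tag == "I":
            # Assumption: an I tag with no open sentence simply starts one.
            current.append(token)
        else:
            # O (or any unknown tag) closes the open sentence and drops the token.
            if current:
                sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    # Tokens are joined with single spaces; original spacing is not restored.
    return [" ".join(sentence) for sentence in sentences]


# Toy example: the separator "----" is tagged O and belongs to no sentence.
tokens = ["Pt", "stable", ".", "----", "Plan", ":", "d/c", "tomorrow"]
tags = ["B", "I", "I", "O", "B", "I", "I", "I"]
print(bio_to_sentences(tokens, tags))
# prints ['Pt stable .', 'Plan : d/c tomorrow']
```

The O tag is what lets the model skip non-sentence material such as separators or layout debris instead of forcing every token into a sentence.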

Our segmentation model is based on microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext, and we trained it on MIMIC-III notes for a sequence labeling (token classification) task.
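Because the model is a token classifier, it can presumably be loaded with the generic token-classification classes of the transformers library. The sketch below assumes that dongfangxu/SentenceSegmenter-MIMIC is the Hub identifier, that label names come from the model's own config, and that the input fits within 512 wordpieces. The function name tag_text is hypothetical, and this is not the authors' sample code.

```python
MODEL_ID = "dongfangxu/SentenceSegmenter-MIMIC"  # assumed Hub identifier


def tag_text(text: str, model_id: str = MODEL_ID):
    """Return (wordpiece, label) pairs predicted for `text` (hypothetical helper)."""
    # Imported lazily so this module loads even without the heavy dependencies.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: inputs longer than 512 wordpieces are truncated here;
    # full notes would need to be split into windows first.
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (1, seq_len, num_labels)

    label_ids = logits.argmax(dim=-1)[0].tolist()
    pieces = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())

    # Label names are read from the model config rather than hard-coded.
    special = set(tokenizer.all_special_tokens)
    return [
        (piece, model.config.id2label[label_id])
        for piece, label_id in zip(pieces, label_ids)
        if piece not in special
    ]
```

Calling tag_text on a short note downloads the weights on first use and returns one pair per wordpiece. Mapping wordpiece labels back to character offsets or whole words is left out of this sketch.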

Citation

Dongfang Xu, Davy Weissenbacher, Karen O’Connor, Siddharth Rawal, and Graciela Gonzalez Hernandez. 2024. Automatic sentence segmentation of clinical record narratives in real-world data. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20780–20793, Miami, Florida, USA. Association for Computational Linguistics.

Model size: 109M params (Safetensors, tensor type F32)