---
library_name: transformers
datasets:
- leduckhai/VietMed-Sum
language:
- vi
pipeline_tag: summarization
---
# Real-time Speech Summarization for Medical Conversations
**Interspeech 2024 (Oral)**
Khai Le-Duc*, Khai-Nguyen Nguyen*, Long Vo-Dang, Truong-Son Hy
*Equal contribution
## Description
In doctor-patient conversations, identifying medically relevant information is crucial, which creates a need for conversation summarization. In this work, we first propose what is, to our knowledge, the first deployable real-time speech summarization system for real-world applications in industry. It generates a local summary after every N speech utterances within a conversation and a global summary after the end of a conversation. Our system could enhance user experience from a business standpoint, while also reducing computational costs from a technical perspective. Secondly, we present VietMed-Sum which, to our knowledge, is the first speech summarization dataset for medical conversations. Thirdly, we are the first to use an LLM and human annotators collaboratively to create gold standard and synthetic summaries for medical conversation summarization. Finally, we present baseline results of state-of-the-art models on VietMed-Sum.
All code, data (English-translated and Vietnamese) and models are available online: [https://github.com/leduckhai/MultiMed/tree/master/VietMed-Sum](https://github.com/leduckhai/MultiMed/tree/master/VietMed-Sum)
Please cite this paper: https://arxiv.org/abs/2406.15888
@article{VietMed_Sum,
title={Real-time Speech Summarization for Medical Conversations},
author={Le-Duc, Khai and Nguyen, Khai-Nguyen and Vo-Dang, Long and Hy, Truong-Son},
journal={arXiv preprint arXiv:2406.15888},
booktitle={Interspeech 2024},
url = {https://arxiv.org/abs/2406.15888},
year={2024}
}
# Model Card
## Model Details
### Model Description
This model summarizes medical dialogues in Vietnamese. It can work in tandem with an ASR system to provide real-time dialogue summaries.
- **Developed by:** Khai-Nguyen Nguyen
- **Language(s) (NLP):** Vietnamese
- **Finetuned from model:** ViT5
## How to Get Started with the Model
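Install the prerequisite Python packages from the command line. The example below runs on the PyTorch backend, so `torch` is installed alongside `transformers`.
```bash
pip install transformers torch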
```
Use the code below to get started with the model.
```python
import torch
from transformers import pipeline

# Use the GPU when CUDA is available; otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
# Initialize the pipeline with the fine-tuned ViT5 summarization model
pipe = pipeline("text2text-generation", model="monishsystem/medisum_vit5", device=device)
# Example text in Vietnamese describing a traditional medicine product
example = "Loại thuốc này chứa các thành phần đông y đặc biệt tốt cho sức khoẻ, giúp tăng cường sinh lý và bổ thận tráng dương, đặc biệt tốt cho người cao tuổi và người có bệnh lý nền"
# Generate a summary, capped at 50 newly generated tokens
summary = pipe(example, max_new_tokens=50)
# The pipeline returns a list of dicts; print only the generated summary text
print(summary[0]["generated_text"])
```
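The description above mentions a local summary after every N utterances and a global summary at the end of a conversation. The following is a minimal sketch of that loop, not the authors' implementation. The names `stream_summaries` and `make_vit5_summarizer` are hypothetical, and two behaviours are assumptions: leftover utterances (fewer than N) get their own local summary, and the global summary is computed over the full transcript.
```python
from typing import Callable, Iterable, Iterator, List, Tuple


def stream_summaries(
    utterances: Iterable[str],
    summarize: Callable[[str], str],
    n: int = 3,
) -> Iterator[Tuple[str, str]]:
    """Yield ("local", summary) after every n utterances, then ("global", summary)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    transcript: List[str] = []  # every utterance seen so far
    window: List[str] = []      # utterances since the last local summary
    for utterance in utterances:
        transcript.append(utterance)
        window.append(utterance)
        if len(window) == n:
            yield "local", summarize(" ".join(window))
            window = []
    # Assumption: a trailing partial window also gets a local summary
    if window:
        yield "local", summarize(" ".join(window))
    # Assumption: the global summary is computed over the full transcript
    if transcript:
        yield "global", summarize(" ".join(transcript))


def make_vit5_summarizer(pipe, max_new_tokens: int = 50) -> Callable[[str], str]:
    """Wrap a text2text-generation pipeline as a plain str -> str function."""
    def summarize(text: str) -> str:
        return pipe(text, max_new_tokens=max_new_tokens)[0]["generated_text"]
    return summarize
```
With the `pipe` object from the example above and a stream of transcribed utterances from an ASR system, `stream_summaries(asr_utterances, make_vit5_summarizer(pipe), n=3)` would yield the local summaries as the conversation progresses, followed by one global summary. Keeping the summarizer as a plain callable makes it possible to swap in a different model or a stub for testing.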