---
license: unknown
---

# LLaVA1.5-BiomedCLIP-Vicuna-7b for multimodal radiology report generation

This model is based on LLaVA1.5-Vicuna-7b and finetuned to generate radiology reports from a chest X-ray and a text prompt. In our case, the instruction was "write the finding section of a chest x-ray radiology report".
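
To make the prompt format concrete, here is a minimal sketch of how the instruction could be wrapped before it is passed to the model. It assumes the model keeps the stock LLaVA-1.5 Vicuna v1 conversation template and the `<image>` placeholder token; neither is stated in this card, so check the released configuration before relying on it. `build_prompt` is a hypothetical helper, not part of any library.

```python
# Assumption: stock LLaVA-1.5 "vicuna_v1" conversation template.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)
IMAGE_TOKEN = "<image>"  # placeholder later replaced by the projected image features

# Instruction used for finetuning, as quoted in this card.
INSTRUCTION = "write the finding section of a chest x-ray radiology report"


def build_prompt(instruction: str = INSTRUCTION) -> str:
    """Return a single-turn prompt with the image placeholder first (hypothetical helper)."""
    return f"{SYSTEM} USER: {IMAGE_TOKEN}\n{instruction} ASSISTANT:"


if __name__ == "__main__":
    print(build_prompt())
```

The resulting string is what would be tokenized alongside the preprocessed X-ray; the generated report is the text the model produces after `ASSISTANT:`.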

The vision encoder of the model is [BiomedCLIP](https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224), the connector is a 2-layer MLP, and the LLM is Vicuna-7b-v1.5.
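
As a rough illustration of what the connector does, the sketch below walks through the tensor shapes and parameter count of a 2-layer MLP that maps image patch features into the LLM embedding space. The numbers are assumptions taken from the public base models, not from this card: a ViT-B/16 encoder at 224x224 input with hidden width 768, a Vicuna-7b hidden size of 4096, a projector of the form Linear, GELU, Linear as in LLaVA-1.5, and only patch tokens (no class token) being forwarded.

```python
# Assumed dimensions (from the public base models, not stated in this card).
IMAGE_SIZE = 224      # input resolution of the ViT-B/16 encoder
PATCH_SIZE = 16       # side length of each square patch
VISION_WIDTH = 768    # hidden width of ViT-B patch features
LLM_WIDTH = 4096      # hidden size of Vicuna-7b


def num_patches(image_size: int, patch_size: int) -> int:
    """Number of patch tokens produced for a square image."""
    per_side = image_size // patch_size
    return per_side * per_side


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def connector_params(vision_width: int, llm_width: int) -> int:
    """Parameters of a Linear -> GELU -> Linear projector (GELU has none)."""
    return linear_params(vision_width, llm_width) + linear_params(llm_width, llm_width)


if __name__ == "__main__":
    tokens = num_patches(IMAGE_SIZE, PATCH_SIZE)
    print(f"image tokens fed to the LLM: {tokens}")
    print(f"shape before connector: ({tokens}, {VISION_WIDTH})")
    print(f"shape after connector:  ({tokens}, {LLM_WIDTH})")
    print(f"connector parameters:   {connector_params(VISION_WIDTH, LLM_WIDTH):,}")
```

Under these assumptions the connector is small relative to the 7B-parameter LLM; its role is only to translate vision features into vectors the language model can consume as if they were token embeddings.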

The dataset used for finetuning was the version of MIMIC-CXR shared for the Radiology Report Generation challenge at the BioNLP Workshop of the Association for Computational Linguistics (ACL) 2024.

We finetuned for 3 epochs on the 148,374 findings of MIMIC-CXR.

The model metrics on the 1,063 samples of the hidden test set of the ACL challenge are the following:

| Method                        | BLEU-4 | ROUGE-L | BERTScore | F1-CheXbert | F1-RadGraph | Avg   |
|-------------------------------|--------|---------|-----------|-------------|-------------|-------|
| llava1.5-biomedclip-Vicuna-7b | 3.48   | 16.31   | 35.49     | 29.37       | 15.51       | 20.03 |

When we used BiomedCLIP for the challenge, we saw a clear improvement of 6.31 pp in F1-CheXbert over the second-best model on this metric (29.37 vs. 23.06).
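
The Avg column and the reported gap can be checked with a few lines of arithmetic. This sketch assumes Avg is the unweighted mean of the five metrics, which is consistent with the reported value but is not stated explicitly in this card.

```python
# Scores copied from the results table in this card.
scores = {
    "BLEU-4": 3.48,
    "ROUGE-L": 16.31,
    "BERTScore": 35.49,
    "F1-CheXbert": 29.37,
    "F1-RadGraph": 15.51,
}

# Assumption: Avg is the plain mean of the five metrics.
average = sum(scores.values()) / len(scores)

# Gap in percentage points to the second-best F1-CheXbert score (23.06).
second_best_chexbert = 23.06
gap_pp = scores["F1-CheXbert"] - second_best_chexbert

print(f"Avg: {average:.2f}")
print(f"F1-CheXbert gap: {gap_pp:.2f} pp")
```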

The metrics were calculated directly by the challenge organizers; however, you can reproduce them with the following example code:

```python
import json
import logging

from vilmedic.blocks.scorers.scores import compute_scores

# Reference (ground-truth) findings sections.
refs = [
    "The lungs are clear. The cardiomediastinal silhouette is within normal limits. No acute osseous abnormalities.",
    "The lungs are clear. There is no pleural effusion or pneumothorax. The cardiomediastinal silhouette is normal.",
]

# Generated findings, aligned one-to-one with refs.
hyps = [
    "The lungs are clear. There is no pleural effusion or pneumothorax. The cardiomediastinal silhouette is normal.",
    "The lungs are clear. The cardiomediastinal silhouette is within normal limits. No acute osseous abnormalities.",
]

print("Computing metrics, this can take a while...")
scores = compute_scores(
    ["ROUGEL", "bertscore", "radgraph", "BLEU", "chexbert"],
    refs=refs,
    hyps=hyps,
    split=None,
    seed=None,
    config=None,
    epoch=None,
    logger=logging.getLogger(__name__),
    dump=False,
)
print(json.dumps(scores, indent=4))
```

More details about the challenge can be found on the [challenge web page](https://stanford-aimi.github.io/RRG24/) or on the [workshop site](https://aclweb.org/aclwiki/BioNLP_Workshop).

# Citation

If you use our model in your research or applications, please cite it using the following BibTeX:

```
@inproceedings{campanini-etal-2024-ihealth,
    title = "i{H}ealth-{C}hile-1 at {RRG}24: In-context Learning and Finetuning of a Large Multimodal Model for Radiology Report Generation",
    author = "Campanini, Diego and
      Loch, Oscar and
      Messina, Pablo and
      Elberg, Rafael and
      Parra, Denis",
    editor = "Demner-Fushman, Dina and
      Ananiadou, Sophia and
      Miwa, Makoto and
      Roberts, Kirk and
      Tsujii, Junichi",
    booktitle = "Proceedings of the 23rd Workshop on Biomedical Natural Language Processing",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.bionlp-1.52",
    doi = "10.18653/v1/2024.bionlp-1.52",
    pages = "608--613"
}

@inproceedings{loch-etal-2024-ihealth,
    title = "i{H}ealth-{C}hile-3{\&}2 at {RRG}24: Template Based Report Generation",
    author = "Loch, Oscar and
      Messina, Pablo and
      Elberg, Rafael and
      Campanini, Diego and
      Soto, {\'A}lvaro and
      Vidal, Ren{\'e} and
      Parra, Denis",
    editor = "Demner-Fushman, Dina and
      Ananiadou, Sophia and
      Miwa, Makoto and
      Roberts, Kirk and
      Tsujii, Junichi",
    booktitle = "Proceedings of the 23rd Workshop on Biomedical Natural Language Processing",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.bionlp-1.53",
    doi = "10.18653/v1/2024.bionlp-1.53",
    pages = "614--623"
}
```