---
license: unknown
---
# LLaVA1.5-BiomedCLIP-Vicuna-7b for multimodal radiology report generation

This model is based on LLaVA-1.5 with Vicuna-7b, finetuned to generate medical reports from a chest X-ray and a prompt. In our case, the instruction was "write the finding section of a chest x-ray radiology report".
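
Inference follows the usual LLaVA-1.5 workflow. The sketch below is a non-authoritative example adapted from the upstream LLaVA quick-start; the model path, image file, and conversation mode are placeholders, and loading this checkpoint may require our training/inference code because of the BiomedCLIP vision tower:

```python
# Hypothetical inference sketch (placeholders marked below); it assumes a
# LLaVA-style checkpoint and a llava codebase that supports the BiomedCLIP
# vision tower.
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "path/or/repo-id/of/this-checkpoint"  # placeholder
prompt = "write the finding section of a chest x-ray radiology report"
image_file = "chest_xray.jpg"  # placeholder

args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)  # prints the generated findings section
```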

The vision encoder of the model is [BiomedCLIP](https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224), the connector is a 2-layer MLP, and the LLM is Vicuna-7b-v1.5.
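
For reference, a 2-layer MLP connector in the LLaVA-1.5 style can be sketched as below. The dimensions are assumptions (768 for the BiomedCLIP ViT-B/16 patch features, 4096 for Vicuna-7b), not values read from this checkpoint:

```python
import torch
import torch.nn as nn

class MLPConnector(nn.Module):
    """2-layer MLP that projects vision features into the LLM embedding space
    (LLaVA-1.5 style). Dimensions are assumptions, not checkpoint values."""
    def __init__(self, vision_dim: int = 768, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim)
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)

# Example: 196 patches from a 224x224 image with 16x16 patches
connector = MLPConnector()
tokens = connector(torch.randn(1, 196, 768))
print(tokens.shape)  # torch.Size([1, 196, 4096])
```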

The dataset used for finetuning was the MIMIC-CXR data shared for the Radiology Report Generation challenge (RRG24) at the BioNLP Workshop, ACL 2024.

We finetuned for 3 epochs on the 148,374 findings sections of MIMIC-CXR.

The model metrics on the 1,063 samples of the hidden test set of the ACL challenge are the following:

| Method                        | BLEU-4 | Rouge-L | Bertscore | F1-CheXbert | F1-RadGraph | Avg   |
|-------------------------------|--------|---------|-----------|-------------|-------------|-------|
| llava1.5-biomedclip-Vicuna-7b |  3.48  |  16.31  |   35.49   |    29.37    |    15.51    | 20.03 |
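
The Avg column appears to be the unweighted mean of the five metrics:

```python
# Sanity check of the Avg column (assuming a simple unweighted mean).
# Order: BLEU-4, Rouge-L, Bertscore, F1-CheXbert, F1-RadGraph
scores = [3.48, 16.31, 35.49, 29.37, 15.51]
print(round(sum(scores) / len(scores), 2))  # 20.03
```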


When we used BiomedCLIP for the challenge, we saw a clear improvement of 6.31 pp in F1-CheXbert compared to the second-best
model on this metric (29.37 vs 23.06).

The metrics were calculated directly by the challenge organizers; however, you can reproduce them with the following
example code:

```python
import json
import logging
from vilmedic.blocks.scorers.scores import compute_scores

refs = [  # reference (ground-truth) findings sections
    "The lungs are clear. The cardiomediastinal silhouette is within normal limits. No acute osseous abnormalities.",
    "The lungs are clear.There is no pleural effusion or pneumothorax.The cardiomediastinal silhouette is normal."
]
hyps = [  # model-generated findings sections
    "The lungs are clear. There is no pleural effusion or pneumothorax. The cardiomediastinal silhouette is normal.",
    "The lungs are clear. The cardiomediastinal silhouette is within normal limits. No acute osseous abnormalities."
]
print("Computing metrics, this can take a while...")
print(json.dumps(compute_scores(["ROUGEL", "bertscore", "radgraph", "BLEU", "chexbert"],
                                refs=refs,
                                hyps=hyps,
                                split=None,
                                seed=None,
                                config=None,
                                epoch=None,
                                logger=logging.getLogger(__name__),
                                dump=False),
                 indent=4)
      )
```
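
The `compute_scores` helper comes from the [ViLMedic](https://github.com/jbdel/vilmedic) library, which the snippet assumes is already installed (e.g., via `pip install vilmedic`).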

More details about the challenge can be found on the [challenge web page](https://stanford-aimi.github.io/RRG24/)
or on the [workshop site](https://aclweb.org/aclwiki/BioNLP_Workshop).

# Citation
If you use our model for your research and applications, please cite it using the following BibTeX:

```
@inproceedings{campanini-etal-2024-ihealth,
    title = "i{H}ealth-{C}hile-1 at {RRG}24: In-context Learning and Finetuning of a Large Multimodal Model for Radiology Report Generation",
    author = "Campanini, Diego  and
      Loch, Oscar  and
      Messina, Pablo  and
      Elberg, Rafael  and
      Parra, Denis",
    editor = "Demner-Fushman, Dina  and
      Ananiadou, Sophia  and
      Miwa, Makoto  and
      Roberts, Kirk  and
      Tsujii, Junichi",
    booktitle = "Proceedings of the 23rd Workshop on Biomedical Natural Language Processing",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.bionlp-1.52",
    doi = "10.18653/v1/2024.bionlp-1.52",
    pages = "608--613"
}

@inproceedings{loch-etal-2024-ihealth,
    title = "i{H}ealth-{C}hile-3{\&}2 at {RRG}24: Template Based Report Generation",
    author = "Loch, Oscar  and
      Messina, Pablo  and
      Elberg, Rafael  and
      Campanini, Diego  and
      Soto, {\'A}lvaro  and
      Vidal, Ren{\'e}  and
      Parra, Denis",
    editor = "Demner-Fushman, Dina  and
      Ananiadou, Sophia  and
      Miwa, Makoto  and
      Roberts, Kirk  and
      Tsujii, Junichi",
    booktitle = "Proceedings of the 23rd Workshop on Biomedical Natural Language Processing",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.bionlp-1.53",
    doi = "10.18653/v1/2024.bionlp-1.53",
    pages = "614--623"
}
```