dcampanini committed on
Commit 5ac182a (1 parent: e96d323)
add model metrics to readme
README.md CHANGED
@@ -9,7 +9,45 @@ in our case, the instruction was "write the finding section of chest x-ray radio
 The dataset used for finetuning was the MIMIC-CXR share for the challenge in Radiology Report Generation
 for the Association for Computational Linguistics 2024 at BioNLP Workshop
 
-We used the 148,374 findings of MIMIC-CXR for
+We used the 148,374 findings of MIMIC-CXR for finetuning for 3 epochs.
+
+The model metrics on the 1,063 samples of the hidden test set of the ACL challenge are the following:
+
+| Method      | BLEU-4 | Rouge-L | Bertscore | F1-CheXbert | F1-RadGraph | Avg   |
+|-------------|--------|---------|-----------|-------------|-------------|-------|
+| llavamed1.0 | 5.05   | 19.13   | 47.51     | 23.06       | 15.77       | 22.10 |
+
+
+The metrics were calculated directly by the challenge organizer; however, you can reproduce them with the following
+example code:
+
+```python
+import json
+import logging
+from vilmedic.blocks.scorers.scores import compute_scores
+
+refs = [
+    "The lungs are clear. The cardiomediastinal silhouette is within normal limits. No acute osseous abnormalities.",
+    "The lungs are clear.There is no pleural effusion or pneumothorax.The cardiomediastinal silhouette is normal."
+]
+hyps = [
+    "The lungs are clear. There is no pleural effusion or pneumothorax. The cardiomediastinal silhouette is normal.",
+    "The lungs are clear. The cardiomediastinal silhouette is within normal limits. No acute osseous abnormalities."
+]
+print("Computing metrics, this can take a while...")
+print(json.dumps(compute_scores(["ROUGEL", "bertscore", "radgraph", "BLEU", "chexbert"],
+                                refs=refs,
+                                hyps=hyps,
+                                split=None,
+                                seed=None,
+                                config=None,
+                                epoch=None,
+                                logger=logging.getLogger(__name__),
+                                dump=False),
+                 indent=4)
+      )
+```
+
 
 More details of the challenge can be found on the [challenge web page](https://stanford-aimi.github.io/RRG24/)
 or on the [workshop site](https://aclweb.org/aclwiki/BioNLP_Workshop)