X-iZhang
/

libra-Llama-3.2-3B-Instruct

+---
+license: apache-2.0
+pipeline_tag: image-text-to-text
+base_model:
+- epfl-llm/meditron-7b
+- microsoft/rad-dino
+base_model_relation: merge
+library_name: transformers
+tags:
+- RRG
+- Radiology Report Generation
+- Chest X-ray
+- Multimodal Large Language Models
+---
+<br>
+# **Libra Model Card**
+**Version**: Libra-v1.0
+## Overview
+**Libra** is a multimodal Large Language Model (LLM) specialized in **radiology report generation**, particularly **chest X-ray** interpretations. It can produce detailed _Findings_ sections with **temporal comparisons** (e.g., comparing a current chest X-ray with prior ones). Libra integrates the following key components:
+- **RAD-DINO**: A vision encoder pre-trained on medical imaging datasets for robust feature extraction from chest X-rays.
+- **Meditron-7B**: A 7B-parameter large language model (based on Llama-2) specialized in medical text generation.
+- **Temporal Alignment Connector (TAC)**: A custom adapter that fuses features across multiple time points to enable temporal comparisons.
+This model card provides an overview of Libra’s architecture, training methodology, limitations, and recommended usage guidelines.
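To make the idea of fusing features across time points concrete, here is a minimal toy sketch in plain Python. It is not Libra's actual TAC implementation, which this card does not specify. The function name `fuse_temporal`, the residual formula, the `alpha` weight, and the fallback of reusing the current image when no prior study exists are all illustrative assumptions.

```python
from typing import List, Optional

# One feature vector per image patch (hypothetical layout, for illustration only).
Features = List[List[float]]


def fuse_temporal(
    current: Features,
    prior: Optional[Features] = None,
    alpha: float = 0.5,
) -> Features:
    """Toy fusion: add a scaled (current - prior) residual to each current feature.

    This is an assumed stand-in for a temporal connector, not the real TAC.
    """
    if prior is None:
        # Assumption: with no prior study, reuse the current features,
        # so the change signal is zero and the output equals the input.
        prior = current
    if len(current) != len(prior):
        raise ValueError("current and prior must have the same number of patches")

    fused: Features = []
    for cur_vec, prev_vec in zip(current, prior):
        if len(cur_vec) != len(prev_vec):
            raise ValueError("feature vectors must have the same dimension")
        # Emphasize what changed between the two time points.
        fused.append([c + alpha * (c - p) for c, p in zip(cur_vec, prev_vec)])
    return fused
```

In this sketch, a patch whose features are unchanged between studies passes through untouched, while a patch that differs is shifted further in the direction of the change. A learned connector would replace the fixed `alpha` with trained parameters.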
+---
+## Paper and Resources
+For details on Libra’s methodology, theoretical foundation, and benchmark results, see:
+- **Project Website**: [Libra v1.0](https://x-izhang.github.io/Libra_v1.0/)
+- **Paper**: [arXiv:2411.19378](https://arxiv.org/abs/2411.19378)
+- **Code Repository**: [X-iZhang/Libra (GitHub)](https://github.com/X-iZhang/Libra)
+Or check out our Spaces demo! [![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-md-dark.svg)](https://huggingface.co/spaces/X-iZhang/Libra)
+---
+## Training Strategy
+Libra is trained in a **two-stage process**:
+1. **Temporal Feature Alignment**
+   - Trains TAC to effectively fuse and align features from different time points (current and previous chest X-rays).
+   - Focuses on capturing notable changes (e.g., appearance or progression of opacities, devices, and lines).
+2. **Fine-Tuning for Radiology Report Generation**
+   - Fine-tunes the language model on a large dataset of paired chest X-ray images and radiology reports.
+   - Emphasizes generating the _Findings_ section, with particular attention to temporal descriptors.
+---
+## Intended Use
+Libra is primarily designed to **assist** clinical practitioners, researchers, and medical students in generating chest X-ray reports. Key applications include:
+- **Clinical Decision Support**: Providing draft findings that can be refined by a radiologist.
+- **Educational Tool**: Demonstrating example interpretations and temporal changes for training radiology residents.
+- **Research**: Facilitating studies on automated report generation and temporal feature learning in medical imaging.
+> **Important**: Outputs should be reviewed by qualified radiologists or medical professionals before final clinical decisions are made.
+---
+## Limitations and Recommendations
+1. **Data Bias**: The model’s performance may be less reliable for underrepresented demographics or rare pathologies.
+2. **Clinical Oversight**: Always involve a medical professional to verify the results; Libra is not a substitute for professional judgment.
+3. **Temporal Inaccuracies**: Despite TAC’s focus on temporal alignment, subtle or uncommon changes may go unrecognized.
+4. **Generalization**: Libra’s performance on chest X-ray types or conditions not seen during training may be limited.
+---
+## Ethical Considerations
+- **Patient Privacy**: Ensure all data is fully de-identified and handled in compliance with HIPAA/GDPR (or other relevant privacy regulations).
+- **Responsible Use**: Use Libra’s outputs with care; they are not guaranteed to be error-free.
+- **Accountability**: Users and organizations must assume responsibility for verifying clinical accuracy and safety.
+---
+## How to Cite ✒️
+If you use Libra in academic or research contexts, please cite:
+```bibtex
+@misc{zhang2024libraleveragingtemporalimages,
+      title={Libra: Leveraging Temporal Images for Biomedical Radiology Analysis},
+      author={Xi Zhang and Zaiqiao Meng and Jake Lever and Edmond S. L. Ho},
+      year={2024},
+      eprint={2411.19378},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2411.19378},
+}
+```
+## Disclaimer
+This tool is for research and educational purposes only. It is not FDA-approved or CE-marked for clinical use. Users should consult qualified healthcare professionals for any clinical decisions.