---
license: other
license_name: exaonepath
license_link: LICENSE
tags:
- lg-ai
- EXAONEPath-1.5
- pathology
---
# EXAONE Path 1.5
## Introduction
EXAONE Path 1.5 is a whole-slide-image-level (WSI-level) classification framework designed for downstream tasks in pathology, such as cancer subtyping, molecular subtyping, and mutation prediction. It builds upon our previous work, EXAONE Path 1.0, which focused on patch-wise feature extraction by dividing a WSI into patches and embedding each patch into a feature vector.
In EXAONE Path 1.5, we extend this pipeline to take an entire WSI as input. Each patch is first processed with the pretrained EXAONE Path 1.0 encoder to extract patch-level features. These features are then aggregated by a ViT-based (Vision Transformer) aggregator module into a slide-level representation.
This aggregated representation is subsequently passed through a linear classifier to perform downstream tasks such as molecular subtyping, tumor subtyping, and mutation prediction.
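At a high level, the forward pass therefore has three stages: patch encoding, ViT-based aggregation, and linear classification. The snippet below is a minimal structural sketch with illustrative module choices and dimensions; the actual implementation lives in `models/exaonepath`:
```python
import torch
import torch.nn as nn

# Illustrative stand-ins only -- the real encoder and aggregator live in models/exaonepath.
feat_dim, num_patches, num_classes = 768, 1024, 2

patch_feats = torch.randn(num_patches, feat_dim)  # patch features from the EXAONE Path 1.0 encoder
aggregator = nn.TransformerEncoder(               # ViT-style aggregator (sketch)
    nn.TransformerEncoderLayer(d_model=feat_dim, nhead=8, batch_first=True),
    num_layers=2,
)
classifier = nn.Linear(feat_dim, num_classes)

slide_repr = aggregator(patch_feats.unsqueeze(0)).mean(dim=1)  # (1, feat_dim) slide-level representation
probs = torch.softmax(classifier(slide_repr), dim=-1)          # (1, num_classes) class probabilities
```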
To train the aggregator effectively, we adopt a two-stage learning process:
1. **Pretraining**: We employ multimodal learning, aligning slide images with mRNA gene-expression profiles to learn semantically meaningful slide-level representations (see the sketch after this list).
2. **Fine-tuning**: The pretrained model is then adapted to specific downstream classification tasks.
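The sketch below illustrates one common way to implement such slide/gene-expression alignment, a symmetric InfoNCE (CLIP-style) loss. It is a generic example of the technique, not the exact EXAONE Path 1.5 pretraining objective:
```python
import torch
import torch.nn.functional as F

def alignment_loss(slide_emb, gene_emb, temperature=0.07):
    """Symmetric InfoNCE between matched slide / gene-expression pairs.

    A generic multimodal-alignment sketch, not the exact EXAONE Path 1.5
    pretraining loss."""
    slide_emb = F.normalize(slide_emb, dim=-1)
    gene_emb = F.normalize(gene_emb, dim=-1)
    logits = slide_emb @ gene_emb.t() / temperature               # (batch, batch) cosine similarities
    targets = torch.arange(logits.size(0), device=logits.device)  # matched pairs on the diagonal
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
```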
In this repository, we release the model trained for EGFR mutation prediction in lung adenocarcinoma (LUAD), enabling researchers to leverage our pipeline for similar molecular pathology applications.
## Quickstart
### 1. Hardware Requirements
- An NVIDIA GPU is required
- A minimum of 40 GB of GPU memory is recommended
- Tested on Ubuntu 22.04 with NVIDIA driver version 550.144.03

Note: The provided environment setup uses CUDA-enabled PyTorch, so an NVIDIA GPU and compatible drivers are mandatory for running the model.
### 2. Environment Setup
```bash
pip install -r requirements.txt
```
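After installation, you can verify that PyTorch can see a CUDA device and check its memory against the 40 GB recommendation:
```python
import torch

assert torch.cuda.is_available(), "CUDA-enabled PyTorch with an NVIDIA GPU is required."
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"{torch.cuda.get_device_name(0)}: {total_gb:.1f} GB")  # >= 40 GB recommended
```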
### 3-a. Load the Model & Inference
#### Load Model with Hugging Face
```python
from models.exaonepath import EXAONEPathV1p5Downstream

hf_token = "YOUR_HUGGING_FACE_ACCESS_TOKEN"  # token with access to the model repository
model = EXAONEPathV1p5Downstream.from_pretrained("LGAI-EXAONE/EXAONE-Path-1.5", use_auth_token=hf_token)

slide_path = './samples/wsis/1/1.svs'
probs = model(slide_path)  # slide-level class probabilities
```
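The returned `probs` holds the model's class probabilities for the slide. Assuming it is a tensor whose last dimension indexes the classes (check the repository config for the exact class order), you can read off the prediction like this:
```python
# Assumes probs is a tensor of class probabilities; class order depends on the task config.
pred = probs.argmax(-1).item()
print(f"Predicted class {pred} (p = {probs.max():.3f})")
```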
#### Fast CLI Inference
Before running the command below, make sure you update your Hugging Face token.
Open `tokens.py` and replace the placeholder with your actual token:
```python
HF_TOKEN = "YOUR_HUGGING_FACE_ACCESS_TOKEN"
```
Then, run inference with:
```bash
python inference.py --svs_path ./samples/wsis/1/1.svs
```
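To score a whole directory of slides with the same CLI, a simple wrapper (the glob pattern and paths are illustrative) is:
```python
import glob
import subprocess

# Invoke the provided CLI once per slide; adjust the pattern to your data layout.
for svs_path in sorted(glob.glob("./samples/wsis/**/*.svs", recursive=True)):
    subprocess.run(["python", "inference.py", "--svs_path", svs_path], check=True)
```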
### 3-b. Fine-tuning with Pretrained Weights
We provide example scripts and files to help you fine-tune the model on your own dataset.
The provided script fine-tunes the model using pretrained weights stored in `./pretrained_weight.pth`.
#### Extract Features from WSI Images
To train the model using WSI images and their corresponding labels,
you must first extract patch-level features from each WSI using our provided feature extractor.
```bash
python feature_extract.py --input_dir ./samples/wsis/ --output_dir ./samples/feats/
```
This will generate `.pt` feature files in the specified `--output_dir`.
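Each output file is a standard PyTorch file that you can load to sanity-check the extraction. The filename convention and tensor layout below are assumptions (typically one feature vector per patch):
```python
import torch

# Filename assumed to mirror the input WSI; verify against your output_dir.
feats = torch.load("./samples/feats/1.pt")
print(type(feats), getattr(feats, "shape", None))  # expected: (num_patches, feat_dim)
```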
#### Fine-tuning
```bash
bash tuning_script.sh
```
Inside `tuning_script.sh`, you can modify the following variables to match your dataset:
```bash
FEAT_PATH=./samples/feats
LABEL_PATH=./samples/label/label.csv
LABEL_DICT="{'n':0, 'y':1}"
SPLIT_PATH=./samples/splits
```
Change these paths to point to your own feature, label, and split files to start training.
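Before launching a run, it can help to check that every label in your CSV is covered by `LABEL_DICT`. The snippet below assumes the labels sit in the last column of `label.csv`; verify the actual schema against the provided sample file:
```python
import ast
import pandas as pd

label_dict = ast.literal_eval("{'n': 0, 'y': 1}")  # must match LABEL_DICT in tuning_script.sh

# Column layout is an assumption -- inspect ./samples/label/label.csv for the real schema.
df = pd.read_csv("./samples/label/label.csv")
unknown = set(df.iloc[:, -1]) - set(label_dict)
assert not unknown, f"Labels missing from LABEL_DICT: {unknown}"
```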
## Model Performance Comparison
| Metric: AUC | Titan (Conch v1.5 + iBOT, image+text) | PRISM (Virchow + Perceiver, image+text) | CHIEF (CTransPath + CLAM, image+text, CLAM + WSI contrastive) | Prov-GigaPath (GigaPath + LongNet, image-only, masked prediction) | UNI2-h + CLAM (image-only) | EXAONE Path 1.5 (image + gene expression) |
|--------------------------|----------------------------------|-----------------------------------------|--------------------------------------------------------------|-----------------------------------------------------------------|-----------------------------|------------------|
| **TMB (cutoff 10)** | 0.74 | 0.73 | 0.70 | 0.69 | 0.71 | 0.71 |
| **LUAD-EGFR-mut** | 0.76 | 0.80 | 0.73 | 0.73 | 0.79 | 0.81 |
| **LUAD-KRAS-mut** | 0.61 | 0.65 | 0.61 | 0.66 | 0.60 | 0.63 |
| **LUAD-Gene-overexp[1]** | 0.75 | 0.68 | 0.71 | 0.71 | 0.74 | 0.72 |
| **CRC-MSS/MSI** | 0.89 | 0.88 | 0.86 | 0.90 | 0.90 | 0.89 |
| **BRCA-ER_PR_HER2** | 0.82 | 0.79 | 0.76 | 0.79 | 0.81 | 0.77 |
| **Pan-cancer-Gene-mut[2]** | 0.79 | 0.77 | 0.73 | 0.74 | 0.77 | 0.76 |
| **Avg. AUC** | 0.77 | 0.76 | 0.73 | 0.74 | 0.77 | 0.76 |
[1]: **LUAD-Gene-overexp**: a total of 11 genes were evaluated: LAG3, CLDN6, CD274, EGFR, ERBB2, ERBB3, CD276, VTCN1, TACSTD2, FOLR1, and MET.
[2]: **Pan-cancer-Gene-mut**: a total of 7 genes were evaluated: TP53, KRAS, ALK, PIK3CA, MET, EGFR, and PTEN.
## License
The model is licensed under the [EXAONEPath AI Model License Agreement 1.0 - NC](./LICENSE).