Introduction
EXAONE Path MSI is an enhanced whole-slide image (WSI) classification framework that retains the core architecture of EXAONE Path while upgrading its internals for greater efficiency and richer multimodal integration.
The pipeline still unfolds in two stages:
- Patch-wise feature extraction – Each WSI is tiled into 256 × 256 px patches, which are embedded into 768-dimensional vectors using the frozen EXAONE Path encoder.
- Slide-level aggregation – The patch embeddings are aggregated using a Vision Transformer, producing a unified slide-level representation that a lightweight classification head transforms into task-specific probabilities.
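The first stage above can be sketched in a few lines of plain Python. This is only an illustration of the tiling arithmetic, not the project's preprocessing code: it assumes non-overlapping patches, drops partial tiles at the slide edge, and ignores tissue filtering and magnification selection, none of which are specified here. The helper name `tile_coordinates` is hypothetical.

```python
from typing import Iterator, Tuple

PATCH_SIZE = 256  # patch edge length in pixels, as stated above
EMBED_DIM = 768   # width of each patch embedding, as stated above


def tile_coordinates(width: int, height: int,
                     patch: int = PATCH_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (x, y) pixel of every full patch, row by row.

    Assumption: non-overlapping grid, partial edge tiles are skipped.
    """
    for y in range(0, height - patch + 1, patch):
        for x in range(0, width - patch + 1, patch):
            yield (x, y)


# A toy 1024 x 600 px region yields a 4 x 2 grid of full patches.
coords = list(tile_coordinates(1024, 600))

# After encoding, the slide is represented by one 768-d vector per patch,
# i.e. a matrix of this shape, which the aggregator then consumes.
feature_shape = (len(coords), EMBED_DIM)
print(len(coords), feature_shape)
```

Keeping the (x, y) coordinates alongside the embeddings matters for the second stage, because the aggregator can only use spatial layout if each patch's position is retained.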
Key Improvements
FlexAttention + torch.compile
What changed: Replaced vanilla multi‑head self‑attention with IO‑aware FlexAttention kernels and enabled torch.compile to fuse the forward/backward graph at runtime. The new kernel layout dramatically improves both memory efficiency and training and inference throughput.
Coordinate‑aware Relative Bias
What changed: Added an ALiBi‑style distance bias that is computed from the (x, y) patch coordinates themselves, allowing the ViT aggregator to reason about spatial proximity.
Scalable Mixed‑Omics Encoder (Token‑mixing Transformer)
What changed: Each omics modality is first tokenised into a fixed‑length set. All modality‑specific tokens are concatenated into a single sequence and passed through a shared multi‑head self‑attention stack, enabling direct information exchange across modalities in one shot. The aggregated omics representation is subsequently fused with image tokens via cross‑attention. This release uses three modalities (RNA, CNV, DNA‑methylation), but the design is agnostic to modality count and scales linearly with token number.
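To make the coordinate-aware relative bias described above more concrete, here is a minimal sketch in plain Python. It is not the model's implementation. It assumes Euclidean distance measured in patch units and the geometric per-head slopes from the original ALiBi recipe; the actual distance metric and slope schedule used by EXAONE Path MSI are not stated here. The function names `alibi_slopes` and `coordinate_bias` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def alibi_slopes(num_heads: int) -> List[float]:
    """Per-head slopes 2^(-8 * (h + 1) / num_heads), following ALiBi."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def coordinate_bias(coords: Sequence[Tuple[int, int]], num_heads: int,
                    patch: int = 256) -> List[List[List[float]]]:
    """Return bias[h][i][j] = -slope_h * distance(patch i, patch j).

    Assumption: distance is Euclidean, in units of one patch width.
    The bias is added to the attention logits before the softmax,
    so distant patches receive lower attention scores.
    """
    n = len(coords)
    bias = []
    for slope in alibi_slopes(num_heads):
        head = [
            [
                -slope * math.hypot((coords[i][0] - coords[j][0]) / patch,
                                    (coords[i][1] - coords[j][1]) / patch)
                for j in range(n)
            ]
            for i in range(n)
        ]
        bias.append(head)
    return bias


# Three patches: the origin, one patch to the right, two patches down.
coords = [(0, 0), (256, 0), (0, 512)]
bias = coordinate_bias(coords, num_heads=2)
print(bias[0][0][1], bias[0][0][2])
```

In a FlexAttention setting, a bias of this kind would typically be applied through the `score_mod` callback rather than as a materialised matrix, which avoids storing an N × N tensor for slides with many patches.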
Quick Start
Requirements
- An NVIDIA GPU is required
- At least 40 GB of GPU memory is recommended
- Tested on Ubuntu 22.04 with NVIDIA driver version 550.144.03
Installation
pip install -r requirements.txt
Quick Inference
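from models.exaonepath import EXAONEPathV1p5Downstream

# Hugging Face access token used to download the model
hf_token = "YOUR_HUGGING_FACE_ACCESS_TOKEN"

# Load the pretrained MSI classifier from the Hugging Face Hub
model = EXAONEPathV1p5Downstream.from_pretrained(
    "LGAI-EXAONE/EXAONE-Path-MSI",
    use_auth_token=hf_token,
)

# Run inference on one whole-slide image; index 1 is printed as P(CRCMSI)
probs = model("./samples/MSI_high.svs")
print(f"P(CRCMSI) = {probs[1]:.3f}")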
Command‑line
python inference.py --svs_dir ./samples
Model Performance Comparison
Metric (AUC) / Task | Titan (CONCH v1.5 + iBOT, image-text) | PRISM (Virchow + Perceiver, image-text) | CHIEF (CTransPath + CLAM, image-text, WSI-contrastive) | Prov-GigaPath (GigaPath + LongNet, image-only, mask-prediction) | UNI2-h + CLAM (image-only) | EXAONEPath V1.5 | EXAONE Path MSI |
---|---|---|---|---|---|---|---|
CRC-MSI | 0.9370 | 0.9432 | 0.9273 | 0.9541 | 0.9808 | 0.9537 | 0.9844 |
LUAD-TMB (cutoff 10) | 0.6901 | 0.6445 | 0.6501 | 0.6744 | 0.6686 | 0.6846 | 0.6842 |
LUAD-EGFR-mut | 0.8197 | 0.8152 | 0.7691 | 0.7623 | 0.8577 | 0.7607 | 0.8564 |
LUAD-KRAS-mut | 0.5405 | 0.6299 | 0.4676 | 0.5110 | 0.4690 | 0.5480 | 0.6038 |
BRCA-ER | 0.9343 | 0.8998 | 0.9115 | 0.9186 | 0.9454 | 0.9096 | 0.9278 |
BRCA-PR | 0.8804 | 0.8613 | 0.8470 | 0.8595 | 0.8770 | 0.8215 | 0.8430 |
BRCA-HER2 | 0.8046 | 0.8154 | 0.7822 | 0.7891 | 0.8322 | 0.7811 | 0.8050 |
BRCA-TP53 | 0.7879 | 0.8415 | 0.7879 | 0.7388 | 0.8080 | 0.6607 | 0.7656 |
BRCA-PIK3CA | 0.7577 | 0.8929 | 0.7015 | 0.7347 | 0.8571 | 0.7066 | 0.7908 |
RCC-PBRM1 | 0.6383 | 0.5570 | 0.5129 | 0.5270 | 0.5011 | 0.4445 | 0.5780 |
RCC-BAP1 | 0.7188 | 0.7690 | 0.7310 | 0.6970 | 0.7160 | 0.7337 | 0.7323 |
COAD-KRAS | 0.7642 | 0.7443 | 0.6989 | 0.8153 | 0.9432 | 0.6790 | 0.8693 |
COAD-TP53 | 0.8889 | 0.8160 | 0.7014 | 0.7118 | 0.7830 | 0.8785 | 0.8715 |
Average | 0.7817 | 0.7869 | 0.7299 | 0.7457 | 0.7876 | 0.7356 | 0.7932 |
License
The model is licensed under the EXAONEPath AI Model License Agreement 1.0 - NC.