LGAI-EXAONE/EXAONE-Path-MSI

Introduction

EXAONE Path MSI is an enhanced whole-slide image (WSI) classification framework that retains the core architecture of EXAONE Path while upgrading its internals for greater efficiency and richer multimodal integration.

The pipeline still unfolds in two stages:

1. Patch-wise feature extraction – Each WSI is tiled into 256 × 256 px patches, which are embedded into 768-dimensional vectors using the frozen EXAONE Path encoder.
2. Slide-level aggregation – The patch embeddings are aggregated using a Vision Transformer, producing a unified slide-level representation that a lightweight classification head transforms into task-specific probabilities.
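The two stages can be read in terms of data shapes. The snippet below is a minimal sketch, not the EXAONE Path implementation: `tile_origins` is a hypothetical helper, tiles are assumed to be non-overlapping, and partial tiles at the right and bottom edges are assumed to be dropped.

```python
def tile_origins(width, height, patch=256):
    """Return top-left (x, y) pixel coordinates of non-overlapping tiles.

    Assumption: partial tiles at the right and bottom edges are dropped.
    """
    return [
        (x, y)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]


# A toy 1024 x 600 px region yields 4 columns x 2 rows of full tiles.
origins = tile_origins(1024, 600)
n_patches = len(origins)

# Stage 1 would map each tile to a 768-dimensional vector, giving an
# (n_patches, 768) matrix; stage 2 aggregates that matrix into a single
# slide-level representation.
embedding_shape = (n_patches, 768)
print(n_patches, embedding_shape)
```

The (x, y) origins are kept alongside the embeddings because a coordinate-aware aggregator needs them later.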

Key Improvements

FlexAttention + torch.compile
What changed: Vanilla multi‑head self‑attention was replaced with IO‑aware FlexAttention kernels, and torch.compile is enabled to fuse the forward and backward graphs at runtime. The new kernel layout improves memory efficiency as well as training and inference throughput.
Coordinate‑aware Relative Bias
What changed: Added an ALiBi‑style distance bias that is computed from the (x, y) patch coordinates themselves, allowing the ViT aggregator to reason about spatial proximity.
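As a rough illustration of a distance-based bias, the sketch below subtracts a per-head slope times the Euclidean distance between patch coordinates from the attention logits. The slope schedule is the geometric sequence used in the original ALiBi formulation. The distance metric, the slopes, and the function name `coordinate_bias` are assumptions for illustration, not details confirmed by this release.

```python
import numpy as np


def coordinate_bias(coords, num_heads):
    """Build an ALiBi-style bias of shape (num_heads, N, N) from (x, y) coordinates.

    Assumptions: Euclidean distance, and the geometric slope schedule
    2 ** (-8 * h / num_heads) for h = 1..num_heads.
    """
    coords = np.asarray(coords, dtype=np.float64)      # (N, 2)
    diff = coords[:, None, :] - coords[None, :, :]     # (N, N, 2)
    dist = np.linalg.norm(diff, axis=-1)               # (N, N)
    slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
    return -slopes[:, None, None] * dist[None, :, :]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


# Three patches in patch-grid units: two neighbours and one far away.
coords = [(0, 0), (1, 0), (10, 0)]
bias = coordinate_bias(coords, num_heads=4)

# With uniform content scores, attention is driven by the bias alone,
# so nearer patches receive more weight than distant ones.
scores = np.zeros((4, 3, 3))
attn = softmax(scores + bias)
print(attn[0, 0].round(3))
```

In FlexAttention, a bias of this kind can be expressed as a `score_mod` callable that adjusts each score from its query and key indices, so the full N × N bias matrix does not have to be materialised.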
Scalable Mixed‑Omics Encoder (Token‑mixing Transformer)
What changed: Each omics modality is first tokenised into a fixed‑length set. All modality‑specific tokens are concatenated into a single sequence and passed through a shared multi‑head self‑attention stack, enabling direct information exchange across modalities in a single pass. The aggregated omics representation is subsequently fused with image tokens via cross‑attention. This release uses three modalities (RNA, CNV, DNA‑methylation), but the design is agnostic to modality count and scales linearly with the number of tokens.
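The token-mixing idea can be sketched with plain NumPy. This is a simplified illustration under stated assumptions, not the released encoder: it uses a single attention head without learned projections, a random linear map stands in for each learned tokeniser, the image tokens are assumed to act as queries in the cross-attention step, and all sizes and names (`D`, `TOKENS`, `tokenise`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16       # hypothetical model width
TOKENS = 4   # hypothetical fixed token count per modality


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    """Single-head scaled dot-product attention without learned projections."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def tokenise(features, n_tokens=TOKENS, dim=D):
    """Project a 1-D feature vector to a fixed-length token set (n_tokens, dim).

    A random linear map stands in for a learned, modality-specific tokeniser.
    """
    n_in = features.shape[0]
    w = rng.standard_normal((n_in, n_tokens * dim)) / np.sqrt(n_in)
    return (features @ w).reshape(n_tokens, dim)


# Toy inputs with a different raw dimensionality per modality.
omics = {
    "rna": rng.standard_normal(200),
    "cnv": rng.standard_normal(120),
    "methylation": rng.standard_normal(300),
}

# 1) Tokenise each modality, 2) concatenate into one sequence.
omics_tokens = np.concatenate([tokenise(x) for x in omics.values()], axis=0)

# 3) Shared self-attention lets every token attend across modalities.
mixed = attention(omics_tokens, omics_tokens, omics_tokens)

# 4) Cross-attention: image tokens query the mixed omics tokens (assumed direction).
image_tokens = rng.standard_normal((10, D))
fused = attention(image_tokens, mixed, mixed)

print(omics_tokens.shape, mixed.shape, fused.shape)
```

Adding a fourth modality to the `omics` dictionary would only append another `TOKENS` rows to the sequence; nothing else in the sketch changes.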

Quick Start

Requirements

An NVIDIA GPU is required
At least 40 GB of GPU memory is recommended
Tested on Ubuntu 22.04 with NVIDIA driver version 550.144.03
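Before installing, it may help to check the local GPU against the memory recommendation. The following is a small sketch rather than part of the repository: `meets_memory_recommendation` is a hypothetical helper, and "40 GB" is interpreted as decimal gigabytes (an assumption). PyTorch is imported only if it is available.

```python
def meets_memory_recommendation(total_bytes, min_gb=40):
    """Return True if total_bytes reaches min_gb decimal gigabytes (assumed unit)."""
    return total_bytes >= min_gb * 10**9


try:
    import torch
except ImportError:  # PyTorch not installed; skip the live check
    torch = None

if torch is not None and torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    ok = meets_memory_recommendation(props.total_memory)
    print(f"{props.name}: {props.total_memory / 10**9:.1f} GB, recommendation met: {ok}")
else:
    print("No CUDA-capable GPU detected.")
```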

Installation

pip install -r requirements.txt

Quick Inference

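from models.exaonepath import EXAONEPathV1p5Downstream

# Replace with your own Hugging Face access token.
hf_token = "YOUR_HUGGING_FACE_ACCESS_TOKEN"

# Load the pretrained model from the Hugging Face Hub.
model = EXAONEPathV1p5Downstream.from_pretrained(
    "LGAI-EXAONE/EXAONE-Path-MSI",
    use_auth_token=hf_token
)

# Run inference on a single whole-slide image and print the second output probability.
probs = model("./samples/MSI_high.svs")
print(f"P(CRCMSI) = {probs[1]:.3f}")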

Command‑line

python inference.py --svs_dir ./samples
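For use from Python rather than the command line, a directory of slides could be processed with a loop like the one below. This is a hypothetical sketch and does not describe what `inference.py` does internally: `find_slides` and `predict_directory` are illustrative names, and a stand-in callable replaces the real model so the snippet runs without a GPU.

```python
from pathlib import Path


def find_slides(svs_dir):
    """Return sorted paths of .svs files directly inside svs_dir."""
    return sorted(Path(svs_dir).glob("*.svs"))


def predict_directory(model, svs_dir):
    """Map each slide file name to the model output for that slide.

    `model` is any callable that accepts a slide path string, as in the
    Quick Inference example.
    """
    return {path.name: model(str(path)) for path in find_slides(svs_dir)}


def dummy_model(path):
    """Stand-in for the real model; returns fixed two-class probabilities."""
    return [0.5, 0.5]


# Prints one line per .svs file found; prints nothing if the folder is empty or missing.
for name, probs in predict_directory(dummy_model, "./samples").items():
    print(f"{name}: P(CRCMSI) = {probs[1]:.3f}")
```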

Model Performance Comparison

Task (metric: AUC)	Titan (CONCH v1.5 + iBOT, image-text)	PRISM (Virchow + Perceiver, image-text)	CHIEF (CTransPath + CLAM, image-text, WSI-contrastive)	Prov-GigaPath (GigaPath + LongNet, image-only, mask-prediction)	UNI2-h + CLAM (image-only)	EXAONEPath V1.5	EXAONE Path MSI
CRC-MSI	0.9370	0.9432	0.9273	0.9541	0.9808	0.9537	0.9844
LUAD-TMB (cutoff 10)	0.6901	0.6445	0.6501	0.6744	0.6686	0.6846	0.6842
LUAD-EGFR-mut	0.8197	0.8152	0.7691	0.7623	0.8577	0.7607	0.8564
LUAD-KRAS-mut	0.5405	0.6299	0.4676	0.5110	0.4690	0.5480	0.6038
BRCA-ER	0.9343	0.8998	0.9115	0.9186	0.9454	0.9096	0.9278
BRCA-PR	0.8804	0.8613	0.8470	0.8595	0.8770	0.8215	0.8430
BRCA-HER2	0.8046	0.8154	0.7822	0.7891	0.8322	0.7811	0.8050
BRCA-TP53	0.7879	0.8415	0.7879	0.7388	0.8080	0.6607	0.7656
BRCA-PIK3CA	0.7577	0.8929	0.7015	0.7347	0.8571	0.7066	0.7908
RCC-PBRM1	0.6383	0.5570	0.5129	0.5270	0.5011	0.4445	0.5780
RCC-BAP1	0.7188	0.7690	0.7310	0.6970	0.7160	0.7337	0.7323
COAD-KRAS	0.7642	0.7443	0.6989	0.8153	0.9432	0.6790	0.8693
COAD-TP53	0.8889	0.8160	0.7014	0.7118	0.7830	0.8785	0.8715
Average	0.7817	0.7869	0.7299	0.7457	0.7876	0.7356	0.7932
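The Average row is consistent with an unweighted mean over the 13 tasks. As a quick check, the snippet below recomputes it for the EXAONE Path MSI column, with values copied from the table above.

```python
# AUC values from the "EXAONE Path MSI" column, in table order.
exaone_path_msi_auc = [
    0.9844, 0.6842, 0.8564, 0.6038, 0.9278, 0.8430, 0.8050,
    0.7656, 0.7908, 0.5780, 0.7323, 0.8693, 0.8715,
]

# Unweighted (macro) mean across tasks.
mean_auc = sum(exaone_path_msi_auc) / len(exaone_path_msi_auc)
print(f"{mean_auc:.4f}")  # 0.7932
```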

License

The model is licensed under the EXAONEPath AI Model License Agreement 1.0 - NC.