MCCPBR/EAGLE

@@ -1,53 +1,53 @@
----
-license: cc-by-nc-sa-4.0
-language:
-- en
-tags:
-- pathology
-- computational pathology
-- LUAD
-- EGFR
-- computational biomarkers
----
-# EAGLE
-The use of artificial intelligence (AI) models to develop computational biomarkers from H&E-stained digital histopathology images has emerged as a promising diagnostic approach for enhancing clinical management for cancer patients. Computational biomarkers offer several advantages: 1) they are digitally deployed, 2) cost-effective, and 3) do not consume tissue. Despite numerous promising models in the literature, their clinical utility in real-world settings has yet to be established. Assessment for *EGFR* mutations in lung adenocarcinoma is challenged by a need for rapid, accurate results at a low cost while preserving tissue for comprehensive genomic sequencing. Polymerase chain reaction (PCR)-based assays are used to provide rapid results but are less accurate than genomic sequencing and deplete the tissue. Highly accurate and robust computational biomarkers, aided by use of modern foundation models, can fill such a niche. We compiled the largest, international, multi-institutional clinical cohort of digital histopathology images of lung adenocarcinomas (N=8461 cases/slides) to develop and validate a state-of-the-art computational *EGFR* biomarker. The model utilizes an open source foundation model that is fine-tuned for the task of *EGFR* classification. We demonstrate that fine-tuning the foundation model results in improved task-specific performance that generalizes across institutions and scanning protocols with clinical-level performance (mean AUC: internal 0.847, external 0.870). To realize the translation into the clinic as well as investigate its in-real-time (IRT) usability, we conducted the first-of-its-kind prospective silent trial of a computational biomarker on primary samples, achieving an AUC of 0.896. We demonstrate that an AI assisted rapid *EGFR* screening workflow reduces the amount of rapid testing needed by up to 43% while maintaining clinical standard performance. 
The retrospective and prospective results demonstrate for the first time the clinical utility and efficacy of an H&E-based computational biomarker in a real-world clinical setting.
-## Model
-The model consists of: 1) a 1.1 billion parameter vision transformer (ViT-g) that encodes high-resolution (20x magnification, 0.5 microns per pixel) 224-pixel patches into a 1,536 feature vector; 2) a gated MIL attention (GMA) aggregator that integrates all encoded patches from a slide into a global slide-level feature representation; and 3) a linear classifier that outputs the probability of an *EGFR* mutation based on the input slide data.
-The tile encoder was initialized with [GigaPath](https://huggingface.co/prov-gigapath/prov-gigapath). The model was trained end-to-end for the task of predicting *EGFR* mutational status from H&E slides.
-## Model Usage
-To get started, first clone the repository with this command:
-```bash
-  git clone --no-checkout https://huggingface.co/MCCPBR/EAGLE && cd EAGLE
-```
-Now you can use the following code:
-```python
-from PIL import Image
-import numpy as np
-import eagle
-import torch
-import torchvision.transforms as transforms
-# Load model
-model = eagle.EAGLE()
-# Set up transform
-transform = transforms.Compose([
-    transforms.ToTensor(),
-    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
-])
-# Image
-img = np.random.randint(0, 256, size=224*224*3).reshape(224,224,3).astype(np.uint8)
-img = Image.fromarray(img)
-img = transform(img).unsqueeze(0)
-# Inference
-with torch.no_grad():
-    h, att, p = model(img)
 ```

+---
+license: cc-by-nc-sa-4.0
+language:
+- en
+tags:
+- pathology
+- computational pathology
+- LUAD
+- EGFR
+- computational biomarkers
+---
+# EAGLE
+The use of artificial intelligence (AI) models to develop computational biomarkers from H&E-stained digital histopathology images has emerged as a promising diagnostic approach for enhancing clinical management for cancer patients. Computational biomarkers offer several advantages: 1) they are digitally deployed, 2) cost-effective, and 3) do not consume tissue. Despite numerous promising models in the literature, their clinical utility in real-world settings has yet to be established. Assessment for *EGFR* mutations in lung adenocarcinoma is challenged by a need for rapid, accurate results at a low cost while preserving tissue for comprehensive genomic sequencing. Polymerase chain reaction (PCR)-based assays are used to provide rapid results but are less accurate than genomic sequencing and deplete the tissue. Highly accurate and robust computational biomarkers, aided by use of modern foundation models, can fill such a niche. We compiled the largest, international, multi-institutional clinical cohort of digital histopathology images of lung adenocarcinomas (N=8461 cases/slides) to develop and validate a state-of-the-art computational *EGFR* biomarker. The model utilizes an open source foundation model that is fine-tuned for the task of *EGFR* classification. We demonstrate that fine-tuning the foundation model results in improved task-specific performance that generalizes across institutions and scanning protocols with clinical-level performance (mean AUC: internal 0.847, external 0.870). To realize the translation into the clinic as well as investigate its in-real-time (IRT) usability, we conducted the first-of-its-kind prospective silent trial of a computational biomarker on primary samples, achieving an AUC of 0.896. We demonstrate that an AI assisted rapid *EGFR* screening workflow reduces the amount of rapid testing needed by up to 43% while maintaining clinical standard performance. 
The retrospective and prospective results demonstrate for the first time the clinical utility and efficacy of an H&E-based computational biomarker in a real-world clinical setting.
+## Model
+The model consists of: 1) a 1.1 billion parameter vision transformer (ViT-g) that encodes high-resolution (20x magnification, 0.5 microns per pixel) 224-pixel patches into a 1,536 feature vector; 2) a gated MIL attention (GMA) aggregator that integrates all encoded patches from a slide into a global slide-level feature representation; and 3) a linear classifier that outputs the probability of an *EGFR* mutation based on the input slide data.
+The tile encoder was initialized with [GigaPath](https://huggingface.co/prov-gigapath/prov-gigapath). The model was trained end-to-end for the task of predicting *EGFR* mutational status from H&E slides.
+## Model Usage
+To get started, first clone the repository with this command:
+```bash
+  git clone --no-checkout https://huggingface.co/MCCPBR/EAGLE && cd EAGLE
+```
+Now you can use the following code:
+```python
+from PIL import Image
+import numpy as np
+import eagle
+import torch
+import torchvision.transforms as transforms
+# Load model
+model = eagle.EAGLE()
+# Set up transform
+transform = transforms.Compose([
+    transforms.ToTensor(),
+    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
+])
+# Image
+img = np.random.randint(0, 256, size=224*224*3).reshape(224,224,3).astype(np.uint8)
+img = Image.fromarray(img)
+img = transform(img).unsqueeze(0)
+# Inference
+with torch.no_grad():
+    h, att, p = model(img)
 ```
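
The Model section in the hunk above describes a gated MIL attention (GMA) aggregator that pools all tile embeddings of a slide into one slide-level vector. As a rough illustration of that idea only, here is a minimal pure-Python sketch of gated attention pooling in the commonly used formulation (a tanh branch multiplied by a sigmoid gate, scored, then softmax-normalized over tiles). The function name, the parameter names `V`, `U`, `w`, and the toy sizes are assumptions made for this example; they are not taken from the `eagle` module, whose actual implementation may differ.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gated_attention_pool(tiles, V, U, w):
    """Pool tile embeddings into one slide-level vector.

    tiles: list of K embeddings, each of length D
    V, U:  hidden-by-D projection matrices (hypothetical names)
    w:     length-hidden vector that scores each gated hidden state
    """
    scores = []
    for h in tiles:
        tanh_branch = [math.tanh(v) for v in matvec(V, h)]
        gate_branch = [1.0 / (1.0 + math.exp(-u)) for u in matvec(U, h)]
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_branch, gate_branch)))

    # Softmax over tiles (shifted by the max score for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Attention-weighted sum of tile embeddings
    dim = len(tiles[0])
    slide = [sum(a * h[d] for a, h in zip(attention, tiles)) for d in range(dim)]
    return slide, attention


rng = random.Random(0)
D, HIDDEN, K = 8, 4, 5  # toy sizes; the card states 1,536 features per tile
tiles = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[rng.gauss(0, 0.1) for _ in range(D)] for _ in range(HIDDEN)]
U = [[rng.gauss(0, 0.1) for _ in range(D)] for _ in range(HIDDEN)]
w = [rng.gauss(0, 0.1) for _ in range(HIDDEN)]

slide_vector, attention = gated_attention_pool(tiles, V, U, w)
```

The attention weights are positive and sum to 1 across tiles, so the slide vector is a convex combination of the tile embeddings. In the architecture described by the card, a linear classifier would then map this slide-level vector to an *EGFR* mutation probability.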