Solve-for-Earth-AI committed on
Commit bfd5f2d · verified · 1 Parent(s): a423288

Update README.md

Files changed (1)
  1. README.md +52 -52
README.md CHANGED
@@ -1,53 +1,53 @@
- ---
- license: cc-by-nc-sa-4.0
- language:
- - en
- tags:
- - pathology
- - computational pathology
- - LUAD
- - EGFR
- - computational biomarkers
- ---
-
- # EAGLE
-
- The use of artificial intelligence (AI) models to develop computational biomarkers from H&E-stained digital histopathology images has emerged as a promising diagnostic approach for enhancing clinical management for cancer patients. Computational biomarkers offer several advantages: 1) they are digitally deployed, 2) cost-effective, and 3) do not consume tissue. Despite numerous promising models in the literature, their clinical utility in real-world settings has yet to be established. Assessment for *EGFR* mutations in lung adenocarcinoma is challenged by a need for rapid, accurate results at a low cost while preserving tissue for comprehensive genomic sequencing. Polymerase chain reaction (PCR)-based assays are used to provide rapid results but are less accurate than genomic sequencing and deplete the tissue. Highly accurate and robust computational biomarkers, aided by use of modern foundation models, can fill such a niche. We compiled the largest, international, multi-institutional clinical cohort of digital histopathology images of lung adenocarcinomas (N=8461 cases/slides) to develop and validate a state-of-the-art computational *EGFR* biomarker. The model utilizes an open source foundation model that is fine-tuned for the task of *EGFR* classification. We demonstrate that fine-tuning the foundation model results in improved task-specific performance that generalizes across institutions and scanning protocols with clinical-level performance (mean AUC: internal 0.847, external 0.870). To realize the translation into the clinic as well as investigate its in-real-time (IRT) usability, we conducted the first-of-its-kind prospective silent trial of a computational biomarker on primary samples, achieving an AUC of 0.896. We demonstrate that an AI assisted rapid *EGFR* screening workflow reduces the amount of rapid testing needed by up to 43% while maintaining clinical standard performance. 
The retrospective and prospective results demonstrate for the first time the clinical utility and efficacy of an H&E-based computational biomarker in a real-world clinical setting.
-
- ## Model
- The model consists of: 1) a 1.1 billion parameter vision transformer (ViT-g) that encodes high-resolution (20x magnification, 0.5 microns per pixel) 224-pixel patches into a 1,536 feature vector; 2) a gated MIL attention (GMA) aggregator that integrates all encoded patches from a slide into a global slide-level feature representation; and 3) a linear classifier that outputs the probability of an *EGFR* mutation based on the input slide data.
- The tile encoder was initialized with [GigaPath](https://huggingface.co/prov-gigapath/prov-gigapath). The model was trained end-to-end for the task of predicting *EGFR* mutational status from H&E slides.
-
- ## Model Usage
-
- To get started, first clone the repository with this command:
- ```bash
- git clone --no-checkout https://huggingface.co/MCCPBR/EAGLE && cd EAGLE
- ```
-
- Now you can use the following code:
- ```python
- from PIL import Image
- import numpy as np
- import eagle
- import torch
- import torchvision.transforms as transforms
-
- # Load model
- model = eagle.EAGLE()
-
- # Set up transform
- transform = transforms.Compose([
-     transforms.ToTensor(),
-     transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
- ])
-
- # Image
- img = np.random.randint(0, 256, size=224*224*3).reshape(224,224,3).astype(np.uint8)
- img = Image.fromarray(img)
- img = transform(img).unsqueeze(0)
-
- # Inference
- with torch.no_grad():
-     h, att, p = model(img)
  ```
 
+ ---
+ license: cc-by-nc-sa-4.0
+ language:
+ - en
+ tags:
+ - pathology
+ - computational pathology
+ - LUAD
+ - EGFR
+ - computational biomarkers
+ ---
+
+ # EAGLE
+
+ The use of artificial intelligence (AI) models to develop computational biomarkers from H&E-stained digital histopathology images has emerged as a promising diagnostic approach for enhancing clinical management for cancer patients. Computational biomarkers offer several advantages: 1) they are digitally deployed, 2) cost-effective, and 3) do not consume tissue. Despite numerous promising models in the literature, their clinical utility in real-world settings has yet to be established. Assessment for *EGFR* mutations in lung adenocarcinoma is challenged by a need for rapid, accurate results at a low cost while preserving tissue for comprehensive genomic sequencing. Polymerase chain reaction (PCR)-based assays are used to provide rapid results but are less accurate than genomic sequencing and deplete the tissue. Highly accurate and robust computational biomarkers, aided by use of modern foundation models, can fill such a niche. We compiled the largest, international, multi-institutional clinical cohort of digital histopathology images of lung adenocarcinomas (N=8461 cases/slides) to develop and validate a state-of-the-art computational *EGFR* biomarker. The model utilizes an open-source foundation model that is fine-tuned for the task of *EGFR* classification. We demonstrate that fine-tuning the foundation model results in improved task-specific performance that generalizes across institutions and scanning protocols with clinical-level performance (mean AUC: internal 0.847, external 0.870). To realize the translation into the clinic as well as investigate its in-real-time (IRT) usability, we conducted the first-of-its-kind prospective silent trial of a computational biomarker on primary samples, achieving an AUC of 0.896. We demonstrate that an AI-assisted rapid *EGFR* screening workflow reduces the amount of rapid testing needed by up to 43% while maintaining clinical standard performance. 
The retrospective and prospective results demonstrate for the first time the clinical utility and efficacy of an H&E-based computational biomarker in a real-world clinical setting.
+
+ ## Model
+ The model consists of: 1) a 1.1-billion-parameter vision transformer (ViT-g) that encodes high-resolution (20x magnification, 0.5 microns per pixel) 224×224-pixel patches into a 1,536-dimensional feature vector; 2) a gated MIL attention (GMA) aggregator that integrates all encoded patches from a slide into a global slide-level feature representation; and 3) a linear classifier that outputs the probability of an *EGFR* mutation based on the input slide data.
+ The tile encoder was initialized with [GigaPath](https://huggingface.co/prov-gigapath/prov-gigapath). The model was trained end-to-end for the task of predicting *EGFR* mutational status from H&E slides.
+
+ ## Model Usage
+
+ To get started, first clone the repository with this command:
+ ```bash
+ git clone --no-checkout https://huggingface.co/MCCPBR/EAGLE && cd EAGLE
+ ```
+
+ Now you can use the following code:
+ ```python
+ from PIL import Image
+ import numpy as np
+ import eagle
+ import torch
+ import torchvision.transforms as transforms
+
+ # Load model
+ model = eagle.EAGLE()
+
+ # Set up transform
+ transform = transforms.Compose([
+     transforms.ToTensor(),
+     transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
+ ])
+
+ # Image
+ img = np.random.randint(0, 256, size=224*224*3).reshape(224,224,3).astype(np.uint8)
+ img = Image.fromarray(img)
+ img = transform(img).unsqueeze(0)
+
+ # Inference
+ with torch.no_grad():
+     h, att, p = model(img)
  ```
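
The Model section of the README describes a three-stage pipeline: tile embeddings, a gated MIL attention aggregator, and a linear classifier. The following is a minimal NumPy sketch of the last two stages (forward pass only), assuming the gated attention formulation of Ilse et al. (2018) and a single-logit sigmoid classifier. It is not the EAGLE implementation: the weight names `V`, `U`, `w`, `w_cls`, the hidden size of 64, and the random inputs are all hypothetical, and only the 1,536-dimensional tile embedding size comes from the README.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_attention_pool(H, V, U, w):
    """Pool tile embeddings H (n_tiles, d) into one slide vector (d,).

    Assumed formulation: score_k = w . (tanh(h_k V) * sigmoid(h_k U)),
    attention = softmax(score) over tiles, slide vector = sum_k att_k * h_k.
    """
    gate = np.tanh(H @ V) * sigmoid(H @ U)   # (n_tiles, hidden)
    scores = gate @ w                        # (n_tiles,)
    scores = scores - scores.max()           # stabilise the softmax
    att = np.exp(scores) / np.exp(scores).sum()
    return att @ H, att


def classify(z, w_cls, b_cls):
    """Linear classifier on the slide vector; sigmoid output is an assumption."""
    return sigmoid(z @ w_cls + b_cls)


# Hypothetical sizes: 8 tiles per slide, 64 hidden units in the attention module.
rng = np.random.default_rng(0)
n_tiles, d, hidden = 8, 1536, 64
H = rng.normal(size=(n_tiles, d))            # stand-in for encoder outputs
V = rng.normal(scale=0.02, size=(d, hidden))
U = rng.normal(scale=0.02, size=(d, hidden))
w = rng.normal(scale=0.02, size=hidden)
w_cls = rng.normal(scale=0.02, size=d)

slide_vec, att = gated_attention_pool(H, V, U, w)
p = classify(slide_vec, w_cls, 0.0)
print(slide_vec.shape, att.shape)
```

Because the attention weights sum to one over tiles, the slide vector is a convex combination of the tile embeddings, which is what lets a variable number of tiles per slide map to a fixed-size representation.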