---
license: cc-by-4.0
---
# Survival prediction using PORPOISE (TCGA GBMLGG)

This model predicts a patient's overall survival from an H&E-stained whole slide image from the TCGA GBMLGG cohort (glioblastoma and lower-grade glioma). It was trained by Jakub Kaczmarzyk using PORPOISE, as an attempt to reproduce the PORPOISE manuscript.

Original journal article: https://doi.org/10.1016/j.ccell.2022.07.004

If you find this model useful, please make sure you cite the original publication.

Inputs: Bag of patches with 128um edge length, embedded with CTransPath.

Output classes: logits of hazards at four timepoints

To calculate the arbitrary risk score given the model outputs `logits`, use the following:

```python
import torch

hazards = torch.sigmoid(logits)  # hazard at each timepoint
S = torch.cumprod(1 - hazards, dim=1)  # survival up to each timepoint
risk = -torch.sum(S, dim=1)
```
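As a worked example, the same computation can be sketched in plain Python for a single sample. The function name `risk_from_logits` is hypothetical and is not part of this model or of PORPOISE; it only illustrates the arithmetic.

```python
import math


def risk_from_logits(logits):
    """Return the risk score for one sample given its hazard logits."""
    # Sigmoid turns each logit into a hazard in (0, 1).
    hazards = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # Cumulative product of (1 - hazard) gives survival up to each timepoint.
    survival = []
    running = 1.0
    for h in hazards:
        running *= 1.0 - h
        survival.append(running)
    # The risk score is the negated sum of the survival values.
    return -sum(survival)


# All-zero logits give hazards of 0.5 and survival 0.5, 0.25, 0.125, 0.0625.
print(risk_from_logits([0.0, 0.0, 0.0, 0.0]))  # -0.9375
```

With four timepoints, each survival value lies between 0 and 1, so the risk score lies between -4 and 0. Values closer to 0 correspond to lower predicted survival.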

## Data

TCGA-GBMLGG was used to train the model. The whole slide images were tiled into 128x128um patches, and each patch was encoded using CTransPath (this produces 768-dimensional embeddings).

The training and validation splits were provided by the original PORPOISE code. We report the fold 3 model because it had the highest c-index among the folds.

Sample sizes:
- Train: 810 slides (455 patients)
- Validation: 201 slides (114 patients)

## Reusing this model

To use this model on the command line, see [WSInfer-MIL](https://github.com/kaczmarj/wsinfer-mil).

Alternatively, you may use PyTorch or ONNX to run the model. First, embed 128um x 128um patches using CTransPath. Then pass the bag of embeddings to the model.

```python
import numpy as np
import onnxruntime as ort

# Placeholder bag of 1,000 patch embeddings (768-dimensional, as produced by CTransPath).
embedding = np.ones((1_000, 768), dtype="float32")
ort_sess = ort.InferenceSession("model.onnx")
logits, attention = ort_sess.run(["logits", "attention"], {"input": embedding})

# Risk score, mirroring the PyTorch snippet above in NumPy.
hazards = 1 / (1 + np.exp(-logits))  # sigmoid
S = np.cumprod(1 - hazards, axis=1)
risk = -np.sum(S, axis=1)
```

The median risk score was -3.22, and this value was used to split patients into low risk and high risk.
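A minimal sketch of that split is shown here. The helper name `risk_group` is hypothetical, and treating scores strictly above the threshold as high risk (with ties falling into the low-risk group) is an assumption, since the handling of ties is not specified.

```python
MEDIAN_RISK = -3.22  # median risk score reported in this model card


def risk_group(risk, threshold=MEDIAN_RISK):
    """Label a risk score as 'high' or 'low' relative to the threshold."""
    # Assumption: scores strictly above the threshold are high risk,
    # because risk = -sum(S) grows as predicted survival shrinks.
    return "high" if risk > threshold else "low"


print(risk_group(-0.9375))  # high
print(risk_group(-3.9))     # low
```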

## Model performance

The model achieves a c-index of 0.83 in the validation set.
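For readers unfamiliar with the metric, the c-index measures how often the model ranks patients correctly: among comparable pairs, the patient with the shorter survival time should have the higher risk score. Below is a simplified sketch of Harrell's c-index in plain Python. It is not necessarily the implementation used to obtain the number above; the function name `concordance_index` is hypothetical, and pairs with tied survival times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Simplified Harrell's c-index.

    times:  observed survival or censoring times
    events: 1 if death was observed, 0 if censored
    risks:  predicted risk scores (higher means higher risk)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the earlier time is an observed event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Perfectly ordered toy data: shorter survival always has higher risk.
print(concordance_index([1, 2, 3], [1, 1, 0], [-1.0, -2.0, -3.0]))  # 1.0
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.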

## Intended uses

This model is ONLY intended for research purposes.

**This model may not be used for clinical purposes.** This model is distributed without warranties, either express or implied.