---
license: mit
datasets:
- tahoebio/Tahoe-100M
tags:
- tahoe-deepdive
---

# Frameshift Team Submission – Tahoe-DeepDive Hackathon 2025

## Team Name
**Frameshift**

## Members
- Jesus Gonzalez Ferrer, UCSC – [@JesusGF1](https://github.com/JesusGF1)
- Carlota Pereda, UCSF – [@carlotapereda](https://github.com/carlotapereda)
- Laura Almonte, UCSF – [@almonteloya](https://github.com/almonteloya)
- Aidan Winters, Arc Institute/UCSF – [@aidanwinters](https://github.com/aidanwinters)
- Michael Kosicki, LBL – [@lotard](https://github.com/lotard)

---

## Project
The project code is available in the GitHub repository [frameshift](https://github.com/almonteloya/frameshift).

The slideshow is available on Google Slides: [slideshow](https://docs.google.com/presentation/d/1u6BGhs_5Xd9IMpzrzD3wI1aiH-8z9j0V80ukDE08NoY/edit?usp=sharing)

### Title
**Defining context-specific responses to drug perturbations in the Tahoe-100M dataset**

### Overview
Personalized (i.e. context-specific) treatments lead to better cancer outcomes.  
We want to develop a framework that measures how drugs affect cells differently based on their genetic context, and explains the genetic programs that cells use to respond.  
We define context-specificity as genotype-, cell line-, tissue-of-origin-, and patient-specific effects on gene expression.

### Motivation
Drugs don't work the same way for everyone. Oncotherapies sometimes lack efficacy and tend to be indiscriminate and toxic.  
Broad-acting chemotherapies are effective but are limited by patient side effects.  
We need better ways of stratifying patients, selecting adequate treatments, and simulating adverse effects before they happen.

---

## Methods

### Data Selection
We applied four methods (E-distance, MSE, Augur, and CellCap) to a subset of the Tahoe-100M dataset.  
We focused on cell lines with **KRAS gain-of-function mutations**, especially **G12C**.  
Selected drugs included known KRAS inhibitors, positive controls, and negative controls.

### E-distance
- Used precomputed `scVI` embeddings from Tahoe-100M.
- Calculated distances to plate-paired `DMSO_TF` for each drug and cell line.
- Visualized results.
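
The comparison above can be sketched as follows. This is a minimal illustration, not the team's code: it assumes the classical energy distance with Euclidean distances, uses plain Python lists in place of the real embedding matrices, and the `simulate_cells` helper and all variable names are hypothetical stand-ins for a drug-treated group and its plate-paired `DMSO_TF` control.

```python
import math
import random


def mean_pairwise_distance(a, b):
    """Mean Euclidean distance over all pairs (x in a, y in b)."""
    total = sum(math.dist(x, y) for x in a for y in b)
    return total / (len(a) * len(b))


def e_distance(treated, control):
    """Energy distance 2*E|X-Y| - E|X-X'| - E|Y-Y'| between two sets of cells."""
    between = mean_pairwise_distance(treated, control)
    within_treated = mean_pairwise_distance(treated, treated)
    within_control = mean_pairwise_distance(control, control)
    return 2 * between - within_treated - within_control


def simulate_cells(rng, n_cells, n_dims, shift):
    """Hypothetical stand-in for embedding rows: Gaussian points centred on `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(n_dims)] for _ in range(n_cells)]


rng = random.Random(0)
dmso = simulate_cells(rng, 50, 10, 0.0)         # control cells
inert_drug = simulate_cells(rng, 50, 10, 0.0)   # same distribution as control
active_drug = simulate_cells(rng, 50, 10, 1.5)  # shifted distribution

print("inert vs DMSO: ", round(e_distance(inert_drug, dmso), 3))
print("active vs DMSO:", round(e_distance(active_drug, dmso), 3))
```

A group drawn from the same distribution as its control should score close to zero, while a shifted group scores higher. On real data, a vectorized implementation would be preferable to these nested loops.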

### MSE
- Followed steps similar to those used for E-distance.
- Started from **pseudobulk samples** provided in the dataset.
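
A rough sketch of this step, assuming each pseudobulk sample is a per-gene expression vector compared against the control from the same plate and cell line. The dictionary layout, the field names (`plate`, `cell_line`, `drug`, `expr`), and the toy values are hypothetical and do not reflect the dataset's actual schema.

```python
def mse(treated, control):
    """Mean squared error between two expression vectors over the same genes."""
    if len(treated) != len(control):
        raise ValueError("expression vectors must cover the same genes")
    return sum((t - c) ** 2 for t, c in zip(treated, control)) / len(treated)


def score_against_plate_controls(samples, controls):
    """Compare every sample with the control from its own plate and cell line."""
    scores = {}
    for sample in samples:
        control = controls[(sample["plate"], sample["cell_line"])]
        scores[(sample["drug"], sample["cell_line"])] = mse(sample["expr"], control)
    return scores


# Toy pseudobulk vectors (4 genes); values are made up for illustration.
controls = {
    ("plate_1", "line_A"): [1.0, 2.0, 3.0, 4.0],
    ("plate_1", "line_B"): [2.0, 2.0, 2.0, 2.0],
}
samples = [
    {"plate": "plate_1", "cell_line": "line_A", "drug": "drug_X",
     "expr": [1.0, 2.0, 3.0, 4.0]},
    {"plate": "plate_1", "cell_line": "line_B", "drug": "drug_X",
     "expr": [4.0, 2.0, 2.0, 0.0]},
]

pseudobulk_scores = score_against_plate_controls(samples, controls)
for key, value in pseudobulk_scores.items():
    print(key, value)
```

Keying the controls by plate and cell line keeps each comparison within a plate, which is the point of plate pairing: batch differences between plates do not leak into the score.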

### Augur
- An **scRNA-seq classifier** that quantifies separability between control and perturbed groups.
- Reports a cross-validated AUC: a score near 0.5 indicates no separability, and a score of 1 indicates perfect separability.
- Applied across all cell lines and drug perturbations.
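
To illustrate the idea of scoring separability with a held-out classifier, here is a simplified sketch. It is not Augur itself: it swaps in a leave-one-out nearest-centroid classifier for Augur's own model, and the simulated data and all function names are hypothetical.

```python
import math
import random


def centroid(cells):
    """Per-dimension mean of a list of cells."""
    return [sum(column) / len(cells) for column in zip(*cells)]


def auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(positive_scores) * len(negative_scores))


def held_out_distances(group, other):
    """For each cell: (distance to own centroid without that cell, distance to other centroid)."""
    other_centroid = centroid(other)
    distances = []
    for i, cell in enumerate(group):
        own_centroid = centroid(group[:i] + group[i + 1:])
        distances.append((math.dist(cell, own_centroid), math.dist(cell, other_centroid)))
    return distances


def separability(treated, control):
    """AUC of a 'looks treated' score; each group needs at least two cells."""
    # Score = distance to control centroid minus distance to treated centroid.
    treated_scores = [d_ctrl - d_trt for d_trt, d_ctrl in held_out_distances(treated, control)]
    control_scores = [d_ctrl - d_trt for d_ctrl, d_trt in held_out_distances(control, treated)]
    return auc(treated_scores, control_scores)


def simulate_group(rng, n_cells, n_dims, shift):
    """Hypothetical stand-in for expression data: Gaussian points centred on `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(n_dims)] for _ in range(n_cells)]


rng = random.Random(1)
control_cells = simulate_group(rng, 40, 10, 0.0)
unresponsive = simulate_group(rng, 40, 10, 0.0)
responsive = simulate_group(rng, 40, 10, 1.5)

print("unresponsive:", round(separability(unresponsive, control_cells), 3))
print("responsive:  ", round(separability(responsive, control_cells), 3))
```

Holding each cell out before computing its own group's centroid keeps the score from being inflated by a cell contributing to the centroid it is compared against. This is the same reason a cross-validated score is used rather than a training-set score.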

### CellCap
- A **generative model** for perturbation data.
- Models correspondence between basal state and measured perturbation.
- Learns interpretable response programs as weighted gene sets.

---

## Results

- **E-distance** and **MSE** failed to detect context-specific drug effects across selected KRAS cell lines.
- **Augur** and **CellCap**:
  - Detected strong responses in **KRAS-G12C** lines.
  - Captured cell-specific gene expression programs linked to KRAS mutations.

---

## Discussion

The discovery of novel cancer therapies is limited by the lack of generalizable experimental and computational workflows. In a proof-of-concept analysis, we tested four computational methods on the Tahoe-100M dataset for identifying context-specific responses to KRAS inhibitors.

- **Augur** and **CellCap** succeeded in detecting KRAS-inhibitor effects in KRAS-G12C cell lines.
- **E-distance** and **MSE** failed to differentiate responses.

We hypothesize that the success of Augur and CellCap lies in their ability to utilize **local, pathway-level expression** rather than global transcriptomic changes.

Preliminary results highlight genes associated with the **Ras-Raf pathway**, suggesting a targeted effect by the drugs.

### Future Directions
We aim to:
- Scale our approach to all cell lines and drugs in Tahoe-100M.
- Identify potential **cell-type-specific drugs**.
- Propose **candidates for clinical development**.