---
license: mit
inference: false
datasets:
- ibm-research/otter_uniprot_bindingdb
---

# Otter UB MF Model Card

Otter-Knowledge model trained using only one modality for molecules: Molformer (MOLF).
## Model details
Otter models are based on Graph Neural Networks (GNNs) that propagate initial embeddings through a set of layers, updating each input embedding according to its node neighbours.
The architecture of the GNN consists of two main blocks: an encoder and a decoder.
- For the encoder, we first define a projection layer, which consists of a set of linear transformations (one for each node modality) that project the nodes into a common dimensionality; we then apply several multi-relational graph convolutional layers (R-GCN), which distinguish between different types of edges between source and target nodes by keeping a set of trainable parameters for each edge type.
- For the decoder, we consider a link prediction task, which consists of a scoring function that maps each triple of source node, target node, and the corresponding edge to a scalar number defined over the interval [0, 1].
+
18
+
19
+ **Model training data:**
20
+
21
+ The model was trained over *Uniprot-BindingDB*
22
+
23
+
24
+ **Paper or resources for more information:**
25
+ - [GitHub Repo](https://github.com/IBM/otter-knowledge)
26
+ - [Paper](https://arxiv.org/abs/2306.12802)
27
+
28
+ **License:**
29
+
30
+ MIT
31
+
32
+ **Where to send questions or comments about the model:**
33
+ - [GitHub Repo](https://github.com/IBM/otter-knowledge)
34
+
## How to use

Clone the repo:
```sh
git clone https://github.com/IBM/otter-knowledge.git
cd otter-knowledge
```

- Use the `BindingAffinity` class:

```python
import torch
from torch import nn


class BindingAffinity(nn.Module):

    def __init__(self, gnn, drug_modality):
        super(BindingAffinity, self).__init__()
        self.drug_modality = drug_modality
        self.protein_modality = 'protein-sequence-mean'
        self.drug_entity_name = 'Drug'
        self.protein_entity_name = 'Protein'
        self.drug_rel_id = 1
        self.protein_rel_id = 2
        self.protein_drug_rel_id = 0
        self.gnn = gnn
        self.device = 'cpu'
        hd1 = 512
        num_input = 2
        # MLP head that combines the drug and protein GNN embeddings
        self.combine = torch.nn.ModuleList([nn.Linear(num_input * hd1, hd1), nn.ReLU(),
                                            nn.Linear(hd1, hd1), nn.ReLU(),
                                            nn.Linear(hd1, 1)])
        self.to(self.device)

    def forward(self, drug_embedding, protein_embedding):
        # node dictionary consumed by the GNN encoder
        nodes = {
            self.drug_modality: {
                'embeddings': drug_embedding.unsqueeze(0).to(self.device),
                'node_indices': torch.tensor([1]).to(self.device)
            },
            self.drug_entity_name: {
                'embeddings': [None],
                'node_indices': torch.tensor([0]).to(self.device)
            },
            self.protein_modality: {
                'embeddings': protein_embedding.unsqueeze(0).to(self.device),
                'node_indices': torch.tensor([3]).to(self.device)
            },
            self.protein_entity_name: {
                'embeddings': [None],
                'node_indices': torch.tensor([2]).to(self.device)
            }
        }
        triples = torch.tensor([[1, 3],
                                [3, 4],
                                [0, 2]]).to(self.device)
        gnn_embeddings = self.gnn.encoder(nodes, triples)
        node_gnn_embeddings = []
        # indices of the Drug and Protein entity nodes
        all_indices = [0, 2]

        for indices in all_indices:
            # index_select expects a 1-D index tensor
            node_gnn_embedding = torch.index_select(gnn_embeddings, dim=0,
                                                    index=torch.tensor([indices]).to(self.device))
            node_gnn_embeddings.append(node_gnn_embedding)

        c = torch.cat(node_gnn_embeddings, dim=-1)
        for m in self.combine:
            c = m(c)

        return c
```

- Run inference with the initial embeddings (obtained by applying the handlers (Molformer, ESM1b) to the SMILES string and the protein sequence):

```python
p = net(drug_embedding=drug_embedding, protein_embedding=protein_embedding)
print(p)
```
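
To try the snippet above without a trained checkpoint, one option is to plug in a stub that satisfies the interface `BindingAffinity` calls. `StubGNN` below is a hypothetical placeholder, not part of the repo: it only mimics an `encoder(nodes, triples)` method returning one 512-d embedding per graph node (the four nodes used in `forward`), so the pipeline can be exercised end to end.

```python
import torch
from torch import nn


class StubGNN(nn.Module):
    """Hypothetical stand-in for a trained Otter GNN (NOT part of the repo).
    It only reproduces the encoder interface that BindingAffinity expects."""

    def __init__(self, hidden_dim=512, num_nodes=4):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_nodes = num_nodes

    def encoder(self, nodes, triples):
        # a real checkpoint would propagate the input embeddings through the
        # graph here; the stub just returns vectors of the expected shape
        return torch.randn(self.num_nodes, self.hidden_dim)
```

With this stub, `net = BindingAffinity(gnn=StubGNN(), drug_modality=...)` (the modality key string is whatever your embeddings are named) accepts arbitrary `drug_embedding` and `protein_embedding` tensors and returns a single scalar score per call.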