---
datasets:
- genbio-ai/transcript_isoform_expression_prediction
metrics:
- spearmanr
- r_squared
base_model:
- genbio-ai/AIDO.RNA-1.6B-CDS
- EleutherAI/enformer-official-rough
tags:
- biology
---
# Bi-modal model for RNA isoform expression prediction

## RNA isoform expression prediction
* Input: a DNA sequence (`dna_seq`) and an RNA sequence (`rna_seq`)
* Output: predicted expression level in each of 30 tissues


## Model architecture 
* Backbones: 
  * DNA: Enformer (full fine-tuning)
  * RNA: AIDO.RNA-1.6B-CDS (LoRA fine-tuning)
* Fusion method: concat fusion (the DNA and RNA backbone embeddings are concatenated)
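
To make the fusion step concrete, here is a minimal NumPy sketch of concat fusion followed by a linear prediction head. It is an illustration only, not the ModelGenerator implementation: the function name `concat_fusion`, the embedding sizes, and the single linear head are assumptions, and random arrays stand in for the Enformer and AIDO.RNA outputs.

```python
import numpy as np

def concat_fusion(dna_emb, rna_emb, weight, bias):
    """Concatenate per-sample DNA and RNA embeddings, then apply a linear head."""
    fused = np.concatenate([dna_emb, rna_emb], axis=-1)  # (batch, d_dna + d_rna)
    return fused @ weight + bias                          # (batch, n_tissues)

rng = np.random.default_rng(0)
# Illustrative sizes (assumed); only n_tissues = 30 comes from the model card.
batch, d_dna, d_rna, n_tissues = 4, 8, 6, 30

dna_emb = rng.normal(size=(batch, d_dna))   # stand-in for the DNA backbone output
rna_emb = rng.normal(size=(batch, d_rna))   # stand-in for the RNA backbone output
weight = rng.normal(size=(d_dna + d_rna, n_tissues))
bias = np.zeros(n_tissues)

pred = concat_fusion(dna_emb, rna_emb, weight, bias)
print(pred.shape)  # (4, 30)
```

Because the two embeddings are simply stacked along the feature axis, the head sees both modalities at once and the backbones can have different embedding widths.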


## Usage

**Download model**
```python
from huggingface_hub import snapshot_download
from pathlib import Path

model_name = "genbio-ai/AIDO.MM-Enformer-RNA-1.6B-CDS-ConcatFusion-rna-isoform-expression-ckpt"
genbio_models_path = Path.home().joinpath('genbio_models', model_name)
genbio_models_path.mkdir(parents=True, exist_ok=True)
snapshot_download(repo_id=model_name, local_dir=genbio_models_path)
```
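
The evaluation step needs a checkpoint path, so it can help to list what the download placed on disk. The sketch below is a hypothetical helper (`find_checkpoints` is not part of `huggingface_hub` or ModelGenerator), and the `.ckpt` extension is an assumption about how the checkpoint file is named; adjust `suffixes` if the repository uses a different one.

```python
from pathlib import Path

def find_checkpoints(root, suffixes=(".ckpt",)):
    """Return files under `root` whose extension is in `suffixes`, sorted by path."""
    root = Path(root)
    if not root.is_dir():
        return []  # nothing downloaded yet
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes)

# Same local directory as in the download snippet above.
model_name = "genbio-ai/AIDO.MM-Enformer-RNA-1.6B-CDS-ConcatFusion-rna-isoform-expression-ckpt"
model_dir = Path.home().joinpath("genbio_models", model_name)

for ckpt in find_checkpoints(model_dir):
    print(ckpt)
```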

**Evaluation script**

After downloading the model, you can evaluate it with [ModelGenerator](https://github.com/genbio-ai/ModelGenerator) using the following script:
```bash
CONFIG_FILE=...     # put the config file path here
CKPT_PATH=...       # put the model checkpoint path here

mgen test --config "$CONFIG_FILE" \
    --data.batch_size 16 \
    --trainer.logger null \
    --model.strict_loading False \
    --model.reset_optimizer_states True \
    --ckpt_path "$CKPT_PATH"
```
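
The card metadata lists `spearmanr` and `r_squared` as metrics. As a rough illustration of what these measure for a single tissue, here is a NumPy-only sketch. It is not the ModelGenerator evaluation code: how the metrics are aggregated across the 30 tissues is not stated here, and `r_squared` is assumed to be the coefficient of determination rather than a squared Pearson correlation.

```python
import numpy as np

def rankdata(x):
    """1-based ranks; tied values share the mean of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        mask = x == value
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearmanr(y_true, y_pred):
    """Spearman correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(rankdata(y_true), rankdata(y_pred))[0, 1])

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy values for one tissue (made up for illustration, not model output).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(spearmanr(y_true, y_pred), r_squared(y_true, y_pred))
```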