Mamba2-mRNA

Mamba2-mRNA is a state-space model built on the Mamba2 architecture and trained at single-nucleotide resolution. Compared with traditional transformer models, it processes sequences faster, handles long sequences efficiently, and needs less memory. The state-space approach captures both local and long-range dependencies in mRNA data, and single-nucleotide resolution allows precise prediction and analysis of genetic elements.
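
To make "single-nucleotide resolution" concrete, the sketch below assigns one token ID to every nucleotide in a sequence. The vocabulary `NUCLEOTIDE_TO_ID` and the function `tokenize_per_nucleotide` are illustrative assumptions and are not the tokenizer shipped with the model, whose IDs and special tokens may differ.

```python
# Hypothetical vocabulary for illustration only; the model's real token IDs may differ.
NUCLEOTIDE_TO_ID = {"A": 0, "C": 1, "G": 2, "U": 3, "N": 4}


def tokenize_per_nucleotide(sequence):
    """Return one token ID per nucleotide; unknown characters map to N."""
    unknown = NUCLEOTIDE_TO_ID["N"]
    return [NUCLEOTIDE_TO_ID.get(base, unknown) for base in sequence.upper()]


token_ids = tokenize_per_nucleotide("AUGGCU")
print(token_ids)  # [0, 3, 2, 2, 1, 3]
```

With this scheme a sequence of 60 nucleotides yields 60 tokens, so every position in the mRNA has its own representation.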

Helical

Install the package

Run the following to install the Helical package via pip:

pip install --upgrade helical

Generate Embeddings

from helical import Mamba2mRNA, Mamba2mRNAConfig
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

input_sequences = ["ACU"*20, "AUG"*20, "AUG"*20, "ACU"*20, "AUU"*20]

mamba2_mrna_config = Mamba2mRNAConfig(batch_size=5, device=device)
mamba2_mrna = Mamba2mRNA(configurer=mamba2_mrna_config)

# prepare data for input to the model
processed_input_data = mamba2_mrna.process_data(input_sequences)

# generate the embeddings for the input data
embeddings = mamba2_mrna.get_embeddings(processed_input_data)
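
The example sequences above use the RNA alphabet (A, C, G, U). If your data comes as DNA-style strings (with T) or in mixed case, a small cleaning step before `process_data` may help. The sketch below is one way to do it. The helper `to_mrna` and the accepted alphabet `ACGUN` are assumptions made for illustration; the library may already perform some of this normalization itself.

```python
# Assumed alphabet for illustration; check which characters the model actually accepts.
ALLOWED_BASES = set("ACGUN")


def to_mrna(sequence):
    """Strip whitespace, uppercase, and convert DNA-style T to U (hypothetical helper)."""
    cleaned = sequence.strip().upper().replace("T", "U")
    invalid = set(cleaned) - ALLOWED_BASES
    if invalid:
        raise ValueError(f"Unexpected characters: {sorted(invalid)}")
    return cleaned


raw_sequences = ["atgacc", "AUGGCU "]
input_sequences = [to_mrna(s) for s in raw_sequences]
print(input_sequences)  # ['AUGACC', 'AUGGCU']
```

The cleaned `input_sequences` list can then be passed to `process_data` as in the example above.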

Fine-Tuning

Classification fine-tuning example:

from helical import Mamba2mRNAFineTuningModel, Mamba2mRNAConfig
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

input_sequences = ["ACU"*20, "AUG"*20, "AUG"*20, "ACU"*20, "AUU"*20]
labels = [0, 2, 2, 0, 1]

mamba2_mrna_config = Mamba2mRNAConfig(batch_size=5, device=device, max_length=100)
mamba2_mrna_fine_tune = Mamba2mRNAFineTuningModel(mamba2_mrna_config=mamba2_mrna_config, fine_tuning_head="classification", output_size=3)

# prepare data for input to the model
train_dataset = mamba2_mrna_fine_tune.process_data(input_sequences)

# fine-tune the model with the relevant training labels
mamba2_mrna_fine_tune.train(train_dataset=train_dataset, train_labels=labels)

# get outputs from the fine-tuned model on a processed dataset
outputs = mamba2_mrna_fine_tune.get_outputs(train_dataset)
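
The example stops at the raw outputs. A minimal sketch of turning them into class predictions follows, assuming `outputs` holds one row of per-class scores per sequence (that is, shape `(n_sequences, output_size)`); this shape is an assumption and is not confirmed here. The names `predict_classes`, `accuracy`, and `example_scores` are hypothetical, and the scores are made-up stand-ins rather than real model outputs.

```python
def predict_classes(scores):
    """Return the index of the highest score in each row (argmax per sequence)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def accuracy(predicted, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predicted, labels))
    return correct / len(labels)


# Made-up stand-in scores for three sequences and three classes.
example_scores = [
    [2.1, 0.3, -1.0],
    [-0.5, 0.2, 1.7],
    [0.1, 1.2, 0.4],
]
example_labels = [0, 2, 1]

predicted = predict_classes(example_scores)
print(predicted)                            # [0, 2, 1]
print(accuracy(predicted, example_labels))  # 1.0
```

If `outputs` is a NumPy array or a tensor, converting it with `outputs.tolist()` first should make it usable with these helpers. In practice, evaluate on held-out sequences rather than on the training set used in the example.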

Cite the package

@software{allard_2024_13135902,
  author       = {Helical Team},
  title        = {helicalAI/helical: v0.0.1-alpha10},
  month        = nov,
  year         = 2024,
  publisher    = {Zenodo},
  version      = {0.0.1a10},
  doi          = {10.5281/zenodo.13135902},
  url          = {https://doi.org/10.5281/zenodo.13135902}
}