probablybots committed
Commit 33167ba · verified · 1 Parent(s): f29ce92

Update README.md

Files changed (1): README.md (+5 -5)
README.md CHANGED
@@ -4,7 +4,7 @@ license: other
 
 # AIDO.Protein-RAG-16B
 
- AIDO.RAGProtein-16B is a multimodal protein language model that integrates Multiple Sequence Alignment (MSA) and structural data, building upon the [AIDO.Protein-16B](https://huggingface.co/genbio-ai/AIDO.Protein-16B) foundation. The training process comprises three main stages:
+ AIDO.Protein-RAG-16B is a multimodal protein language model that integrates Multiple Sequence Alignment (MSA) and structural data, building upon the [AIDO.Protein-16B](https://huggingface.co/genbio-ai/AIDO.Protein-16B) foundation. The training process comprises three main stages:
 
 1. 2D RoPE encoding fine-tuning
 2. Initial training on 100 billion tokens from UniRef50/UniClust30 MSA data
@@ -12,7 +12,7 @@ AIDO.RAGProtein-16B is a multimodal protein language model that integrates Multi
 
 ## Model Architecture Details
 
- AIDO.RAGProtein-16B employs a transformer encoder-only architecture featuring sparse Mixture-of-Experts (MoE) layers that replace dense MLP layers in each transformer block. Utilizing single amino acid tokenization and optimized through masked language modeling (MLM), the model activates 2 experts per token via top-2 routing mechanisms.
+ AIDO.Protein-RAG-16B employs a transformer encoder-only architecture featuring sparse Mixture-of-Experts (MoE) layers that replace dense MLP layers in each transformer block. Utilizing single amino acid tokenization and optimized through masked language modeling (MLM), the model activates 2 experts per token via top-2 routing mechanisms.
 
 <center><img src="proteinmoe_architecture.png" alt="An Overview of AIDO.Protein" style="width:70%; height:auto;" /></center>
 
@@ -29,9 +29,9 @@ More architecture details are shown below:
 | Vocab Size | 44 |
 | Context Length | 2048 |
 
- ## Pre-training of AIDO.RAGProtein-16B
+ ## Pre-training of AIDO.Protein-RAG-16B
 
- Here we briefly introduce the details of pre-training of AIDO.RAGProtein-16B. Mainly divided into three stages: (1) 1D -> 2D RoPE encoding finetuning; (2) UniRef50/Uniclust30 MSA finetuning; (3) AlphaFold Database MSA & Structure tokens finetuning.
+ Here we briefly introduce the details of pre-training of AIDO.Protein-RAG-16B. Mainly divided into three stages: (1) 1D -> 2D RoPE encoding finetuning; (2) UniRef50/Uniclust30 MSA finetuning; (3) AlphaFold Database MSA & Structure tokens finetuning.
 
 ### Data
 
@@ -202,7 +202,7 @@ print(logits.shape)
 
 # Citation
 
- Please cite AIDO.RAGProtein-16B using the following BibTex code:
+ Please cite AIDO.Protein-RAG-16B using the following BibTex code:
 
 ```
 @inproceedings{sun_mixture_2024,
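
The Model Architecture Details paragraph in the diff above describes sparse MoE layers with top-2 routing in place of dense MLPs. The following is a minimal PyTorch sketch of that routing mechanism only, not the model's actual implementation; the class name, expert count, and dimensions are invented for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Top2MoELayer(nn.Module):
    """Toy sparse MoE feed-forward block: each token is processed by its top-2 experts."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.size(-1))                 # (n_tokens, d_model)
        scores = self.router(tokens)                       # (n_tokens, n_experts)
        top_w, top_idx = scores.topk(self.top_k, dim=-1)   # keep the 2 best experts per token
        top_w = F.softmax(top_w, dim=-1)                   # renormalize their routing weights
        out = torch.zeros_like(tokens)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                sel = top_idx[:, k] == e                   # tokens whose k-th choice is expert e
                if sel.any():
                    out[sel] += top_w[sel, k].unsqueeze(-1) * expert(tokens[sel])
        return out.reshape_as(x)


# e.g. Top2MoELayer(d_model=512, d_ff=2048)(torch.randn(2, 16, 512)) keeps the input shape,
# while each token is evaluated by only 2 of the 8 experts.
```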
 
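The final hunk's context line, print(logits.shape), shows that the README's quick-start ends by inspecting MLM logits. For orientation, here is a hedged sketch of that kind of call; the repo id, the trust_remote_code loading path via Hugging Face transformers, and the toy input are all assumptions not confirmed by this diff, and the README's own snippet (outside this hunk) may use a different interface such as GenBio's ModelGenerator.

```python
# Hedged sketch only: the repo id and transformers loading path are assumptions,
# not taken from this commit; see the full README for the supported usage.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "genbio-ai/AIDO.Protein-RAG-16B"  # assumed Hub id after the rename
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

sequence = "MKTAYIAKQR"  # toy amino-acid sequence; the model uses single-residue tokens
inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

print(logits.shape)  # (batch, tokens, vocab); the architecture table lists a vocab size of 44
```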