
tinyllama-proteinpretrain-quinoa

Full-model fine-tuning of TinyLLaMA-1.1B on the "research" split (quinoa protein sequences) of the GreenBeing-Proteins dataset.

Notes: pretraining only on sequences leads the model to generate only protein sequences, which eventually degenerate into repeats such as VVVV or KKKK.
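A minimal sketch of how one might sample from the model and check the output for the repetition described above. The repo id is taken from this model's page; the prompt, the sampling settings (`top_p`, `repetition_penalty`), and the helper names `longest_homopolymer_run` and `generate_sequence` are illustrative assumptions, not settings the author has documented.

```python
from itertools import groupby

# Repo id as shown on the model page.
MODEL_ID = "monsoon-nlp/tinyllama-proteinpretrain-quinoa"


def longest_homopolymer_run(sequence: str) -> tuple[str, int]:
    """Return (character, length) of the longest run of one repeated character.

    On ties the earliest run wins. An empty string yields ("", 0).
    """
    best_char, best_len = "", 0
    for char, group in groupby(sequence):
        run_len = sum(1 for _ in group)
        if run_len > best_len:
            best_char, best_len = char, run_len
    return best_char, best_len


def generate_sequence(prompt: str, max_new_tokens: int = 128) -> str:
    """Sample a continuation of `prompt` (hypothetical helper).

    The sampling values below are assumptions chosen for illustration;
    they are not tuned for this model.
    """
    # Imported lazily so the run checker above works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,
        repetition_penalty=1.2,  # assumption: may delay, not prevent, repeats
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `longest_homopolymer_run(generate_sequence("M"))` on a few samples gives a rough measure of how quickly generation collapses into a single repeated residue.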

  • This model may be replaced by one trained on mixed data (bio/chem text and protein sequences).
  • This model might need "biotokens" to represent the amino acids instead of using the existing tokenizer.
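The "biotokens" idea in the notes above could look roughly like the following sketch: one dedicated token per amino acid, added to the existing tokenizer. The `<aa_X>` token format and the helper names are hypothetical; the card does not specify a design. `add_tokens` and `resize_token_embeddings` are the standard Hugging Face `transformers` calls for extending a vocabulary.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_biotokens() -> list[str]:
    """Build one token string per amino acid, e.g. "<aa_A>" (assumed format)."""
    return [f"<aa_{aa}>" for aa in AMINO_ACIDS]


def encode_as_biotokens(sequence: str) -> str:
    """Rewrite a protein string so each residue maps to exactly one biotoken."""
    pieces = []
    for aa in sequence.upper():
        if aa not in AMINO_ACIDS:
            raise ValueError(f"unsupported residue: {aa!r}")
        pieces.append(f"<aa_{aa}>")
    return "".join(pieces)


def add_biotokens(tokenizer, model) -> int:
    """Register the biotokens and grow the embedding matrix to match.

    Expects a Hugging Face tokenizer and model; returns how many tokens
    were actually new to the vocabulary.
    """
    num_added = tokenizer.add_tokens(make_biotokens())
    model.resize_token_embeddings(len(tokenizer))
    return num_added
```

The newly added embeddings start untrained, so a setup like this would still need further training on sequences rewritten with `encode_as_biotokens` before the tokens carry any meaning.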

More details TBD

Model size: 1.1B params (Safetensors, tensor type F32)
