Base model: westlake-repl/SaProt_650M_AF2

Task type: protein-level regression

Dataset: This dataset contains over 100K mutants derived from the wild-type EYFP protein. The training, validation and test sets contain 100,317, 5,969 and 5,968 samples, respectively. 10% of the double-site mutants and 10% of the triple-site mutants were held out for validation and test, respectively, and the remainder was used for training. This model was trained by Jia Zheng's lab at Westlake University, which will release the dataset at a later date.
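As a quick sanity check on the split sizes quoted above, the short sketch below sums the three subsets and prints each one's share of the total. Only the three counts come from this card; the variable names are illustrative.

```python
# Split sizes as stated in the dataset description.
split_sizes = {"train": 100317, "validation": 5969, "test": 5968}

# Total number of mutants across all three subsets.
total = sum(split_sizes.values())

# Share of the full dataset held by each subset.
fractions = {name: n / total for name, n in split_sizes.items()}

print(f"total: {total}")
for name, n in split_sizes.items():
    print(f"{name}: {n} ({fractions[name]:.1%})")
```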

Model input type: Amino acid sequence

Performance (on test set): 0.94 Spearman's ρ
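Spearman's ρ is the Pearson correlation computed on ranks, so it measures how well the predicted ordering of mutants matches the measured ordering rather than how close the raw values are. The following is a minimal, dependency-free sketch of that metric; it is not the evaluation code used for this model, and the function names are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(pred: Sequence[float], target: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(pred), average_ranks(target))
```

For real evaluations, `scipy.stats.spearmanr` computes the same statistic and also reports a p-value.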

LoRA config:

  • r: 8
  • lora_dropout: 0.0
  • lora_alpha: 16
  • target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]
  • modules_to_save: ["classifier"]
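The keys above match argument names of `peft.LoraConfig` (`r`, `lora_alpha`, `lora_dropout`, `target_modules`, `modules_to_save`). The sketch below keeps them in a plain dictionary so that it runs without extra dependencies; whether the authors built their adapter through the PEFT library in exactly this way is an assumption, and `lora_scaling` is a hypothetical helper. It also computes the scaling factor `lora_alpha / r` that multiplies the low-rank update in the standard LoRA formulation.

```python
# LoRA hyperparameters exactly as listed in this card.
lora_config = {
    "r": 8,
    "lora_dropout": 0.0,
    "lora_alpha": 16,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}


def lora_scaling(cfg: dict) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return cfg["lora_alpha"] / cfg["r"]


scaling = lora_scaling(lora_config)
print(f"LoRA scaling factor: {scaling}")
```

If PEFT is installed, `LoraConfig(**lora_config)` should accept these keys directly. `modules_to_save` marks the `classifier` head as fully trainable and saved alongside the adapter weights.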

Training config:

  • optimizer:
    • class: AdamW
    • betas: (0.9, 0.98)
    • weight_decay: 0.01
  • learning rate: 1e-4
  • epochs: 20
  • batch size: 64
  • precision: 16-mixed
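
To give a sense of the training budget these settings imply, the sketch below collects them in a dictionary and estimates the number of optimizer steps from the training-set size given in the dataset description. It assumes a single device, no gradient accumulation, and that the final partial batch is kept; none of these is stated in this card. The key names and the `steps_per_epoch` helper are illustrative.

```python
import math

# Training hyperparameters as listed in this card.
train_config = {
    "optimizer": {"class": "AdamW", "betas": (0.9, 0.98), "weight_decay": 0.01},
    "learning_rate": 1e-4,
    "epochs": 20,
    "batch_size": 64,
    "precision": "16-mixed",
}

# Training-set size from the dataset description.
NUM_TRAIN_SAMPLES = 100317


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps per epoch, keeping the last partial batch (assumption)."""
    return math.ceil(num_samples / batch_size)


per_epoch = steps_per_epoch(NUM_TRAIN_SAMPLES, train_config["batch_size"])
total_steps = per_epoch * train_config["epochs"]
print(f"{per_epoch} steps/epoch, {total_steps} steps in total")
```

In PyTorch, the optimizer entry would correspond to `torch.optim.AdamW(params, lr=1e-4, betas=(0.9, 0.98), weight_decay=0.01)`. The string `16-mixed` matches the `precision` flag of the PyTorch Lightning `Trainer`, which suggests, but does not confirm, that Lightning was used.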

Collection including SaProtHub/Model-EYFP_100K-650M