Antibody Humanization Model for Variable Light Chain
This is a RoBERTa model trained from scratch for antibody humanization of Variable Light (VL) chain sequences using Masked Language Modeling (MLM).
Model Description
This model was trained on antibody variable light (VL) chain sequences for humanization tasks. It can be used for antibody sequence analysis, humanization, and modeling of VL chain sequence patterns.
Usage
from transformers import RobertaTokenizer, RobertaForMaskedLM
# Load tokenizer and model from Hugging Face
tokenizer = RobertaTokenizer.from_pretrained("hemantn/roberta-base-humAb-vl")
model = RobertaForMaskedLM.from_pretrained("hemantn/roberta-base-humAb-vl")
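The following is a minimal masked-prediction sketch that continues from the snippet above; it assumes the tokenizer treats each amino-acid character as a token and exposes the standard <mask> special token, and the VL fragment and masked position are illustrative only.
import torch
# Mark one residue of an illustrative VL fragment with the mask token
sequence = "DIQMTQSPSSLSASV" + tokenizer.mask_token + "DRVTITCRASQSISSYLNWYQQKPGKAPKLLIY"
inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Locate the masked position and take the highest-scoring residue for it
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))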
Using AnthroAb Python Package
For easier antibody humanization, you can use the AnthroAb Python package, which provides a high-level interface for humanization tasks. The package is available on PyPI and includes both the VH and VL chain models.
Installation
pip install anthroab
Quick Usage
import anthroab
# Humanize a heavy chain sequence (VH); asterisks (*) mark positions to be infilled
vh_sequence = "**QLV*SGVEVKKPGASVKVSCKASGYTFTNYYMYWVRQAPGQGLEWMGGINPSNGGTNFNEKFKNRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARRDYRFDMGFDYWGQGTTVTVSS"
humanized_vh = anthroab.predict_masked(vh_sequence, 'H')
print(f"Humanized VH: {humanized_vh}")
# Humanize a light chain sequence (VL)
vl_sequence = "DIQMTQSPSSLSASV*DRVTITCRASQSISSYLNWYQQKPGKAPKLLIYSASTLASGVPSRFSGSGSGTDF*LTISSLQPEDFATYYCQQSYSTPRTFGQGTKVEIK"
humanized_vl = anthroab.predict_masked(vl_sequence, 'L')
print(f"Humanized VL: {humanized_vl}")
Features
- Easy Installation: Install directly from PyPI with pip install anthroab
- High-Level API: Simple functions for antibody humanization
- Dual Chain Support: Separate models for VH and VL chains
- Sequence Infilling: Fill masked positions with human-like residues
- Mutation Suggestions: Get humanizing mutations for frameworks and CDRs
- Embedding Generation: Create vector representations of antibody sequences
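As an illustration of the embedding use case above, the sketch below mean-pools the encoder's last hidden state using the underlying transformers model directly; the pooling choice and the example fragment are assumptions, and AnthroAb may expose its own embedding helper.
import torch
from transformers import RobertaTokenizer, RobertaModel
# Load only the encoder weights; the LM head is not needed for embeddings
tokenizer = RobertaTokenizer.from_pretrained("hemantn/roberta-base-humAb-vl")
encoder = RobertaModel.from_pretrained("hemantn/roberta-base-humAb-vl")
vl_fragment = "DIQMTQSPSSLSASVGDRVTITCRASQSISSYLNWYQQKPGKAPKLLIY"  # illustrative VL fragment
inputs = tokenizer(vl_fragment, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # shape (1, sequence_length, 768)
embedding = hidden.mean(dim=1).squeeze(0)  # 768-dimensional sequence embedding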
The AnthroAb package uses this RoBERTa model (hemantn/roberta-base-humAb-vl) for VL chain humanization, along with a companion VH model for heavy chain processing.
Model Details
Architecture
- Model: RoBERTa (trained from scratch)
- Architecture: RobertaForMaskedLM
- Model Type: Masked Language Model for antibody sequences
Specifications
- Hidden Size: 768
- Number of Layers: 12
- Number of Attention Heads: 12
- Intermediate Size: 3072
- Max Position Embeddings: 145
- Vocabulary Size: 25 tokens
- Model Size: ~164 MB
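As a quick sanity check, the values above can be read back from the published configuration with the standard transformers config API:
from transformers import RobertaConfig
config = RobertaConfig.from_pretrained("hemantn/roberta-base-humAb-vl")
# Print the hyperparameters listed above
print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)
print(config.intermediate_size, config.max_position_embeddings, config.vocab_size)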