AbNovoBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Monoclonal Antibody De Novo Sequencing Analysis
This repository contains a curated collection of state-of-the-art de novo peptide sequencing models specifically benchmarked for monoclonal antibody (mAb) sequencing from mass spectrometry data. AbNovoBench provides the largest high-quality dataset to date, comprising 1,638,248 peptide-spectrum matches derived from 131 mAbs across six species and 11 proteases, supplemented by eight mAbs with known sequence information for assessing full-length reconstruction.
Models
This repository includes the following models that have been comprehensively evaluated in our benchmark:
AdaNovo
- Model:
AdaNovo/epoch=2-step=170451.ckpt
- Description: Adaptive de novo peptide sequencing model with enhanced accuracy for complex spectra
- Repository: https://github.com/Westlake-OmicsAI/adanovo_v1
CasaNovo
- Models:
CasaNovoV1/epoch=10-step=600000.ckpt (V1)
CasaNovoV2/epoch=7-step=400000.ckpt (V2)
- Description: High-throughput de novo peptide sequencing models with improved performance
- Repository: https://github.com/Noble-Lab/casanovo
ContraNovo
- Model:
ContraNovo/ControNovo.ckpt
- Description: Contrastive learning-based de novo peptide sequencing model
- Repository: https://github.com/BEAM-Labs/ContraNovo
DeepNovo
- Model:
DeepNovo/translate.ckpt-283400.*
- Description: Deep learning-based de novo peptide sequencing with attention mechanisms
- Repository: https://github.com/nh2tran/DeepNovo
InstaNovo
- Model:
InstaNovo/epoch=59-step=1700000.ckpt
- Description: Real-time de novo peptide sequencing model optimized for speed and accuracy
- Repository: https://github.com/instadeepai/InstaNovo
PepNet
- Model:
PepNet/model.h5
- Description: Neural network-based peptide sequence prediction model
- Repository: https://github.com/lkytal/pepnet
PGPointNovo
- Models:
PGPointNovo/backward_deepnovo.pth
PGPointNovo/forward_deepnovo.pth
- Description: Point-based graph neural network for de novo peptide sequencing
- Repository: https://github.com/shallFun4Learning/PGPointNovo
pi-HelixNovo
- Model:
pi-HelixNovo/epoch=14-step=800000.ckpt
- Description: Helix-inspired architecture for peptide sequence prediction
- Repository: https://github.com/PHOENIXcenter/pi-HelixNovo
pi-PrimeNovo
- Model:
pi-PrimeNovo/model_massive.ckpt
- Description: Prime-based de novo peptide sequencing model with massive training
- Repository: https://github.com/PHOENIXcenter/pi-PrimeNovo
PointNovo
- Models:
PointNovo/backward_deepnovo.pth
PointNovo/forward_deepnovo.pth
- Description: Point cloud-based approach for de novo peptide sequencing
- Repository: https://github.com/irleader/PointNovo
SMSNet
- Model:
SMSNet/translate.ckpt-680000.*
- Description: Sequence-to-sequence model for mass spectrometry-based peptide sequencing
- Repository: https://github.com/cmb-chula/SMSNet
Usage
For detailed usage instructions, implementation examples, and model-specific documentation, please refer to the original repositories listed above for each model. Each repository contains:
- Installation instructions
- Model loading examples
- Training procedures
- Inference code
- Performance benchmarks
- Dataset information
This collection serves as a centralized repository of pre-trained models for easy access and comparison.
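As a starting point for working with the collection, the sketch below checks which of the checkpoint files listed in the Models section are present in a local copy of this repository. The `CHECKPOINTS` table and the `find_checkpoints` helper are illustrative names introduced here and are not part of any of the model repositories. The code assumes the directory layout matches the paths listed above.

```python
from pathlib import Path

# Relative checkpoint paths copied from the Models section of this README.
# Entries ending in ".*" are prefixes that can match several files on disk.
CHECKPOINTS = {
    "AdaNovo": ["AdaNovo/epoch=2-step=170451.ckpt"],
    "CasaNovo V1": ["CasaNovoV1/epoch=10-step=600000.ckpt"],
    "CasaNovo V2": ["CasaNovoV2/epoch=7-step=400000.ckpt"],
    "ContraNovo": ["ContraNovo/ControNovo.ckpt"],
    "DeepNovo": ["DeepNovo/translate.ckpt-283400.*"],
    "InstaNovo": ["InstaNovo/epoch=59-step=1700000.ckpt"],
    "PepNet": ["PepNet/model.h5"],
    "PGPointNovo": [
        "PGPointNovo/backward_deepnovo.pth",
        "PGPointNovo/forward_deepnovo.pth",
    ],
    "pi-HelixNovo": ["pi-HelixNovo/epoch=14-step=800000.ckpt"],
    "pi-PrimeNovo": ["pi-PrimeNovo/model_massive.ckpt"],
    "PointNovo": [
        "PointNovo/backward_deepnovo.pth",
        "PointNovo/forward_deepnovo.pth",
    ],
    "SMSNet": ["SMSNet/translate.ckpt-680000.*"],
}


def find_checkpoints(root):
    """Map each model name to the checkpoint files found under `root`.

    Hypothetical helper: a model whose files are missing maps to an
    empty list, so callers can report what still needs downloading.
    """
    root = Path(root)
    found = {}
    for name, patterns in CHECKPOINTS.items():
        files = []
        for pattern in patterns:
            # glob() handles both literal paths and the ".*" prefixes.
            files.extend(sorted(root.glob(pattern)))
        found[name] = files
    return found
```

Calling `find_checkpoints(".")` from the repository root returns a dictionary whose empty lists mark the models that are not yet available locally. Loading a checkpoint still requires the code from the corresponding original repository.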
Benchmark Results
Our comprehensive evaluation of 13 deep learning-based de novo peptide sequencing algorithms across six metric categories revealed:
Peptide Sequencing Performance
- Transformer-based models (ContraNovo, Casanovo V1, and InstaNovo) showed superior performance
- Precision and recall: 0.73–0.79 for amino acids and 0.60–0.67 for peptides
- High efficacy in detecting post-translational modifications
- Excellent generalization across diverse enzymes and species
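To make the amino acid-level and peptide-level precision and recall figures above concrete, here is a minimal sketch of how such metrics can be computed from predicted and ground-truth sequences. It is a simplification and not the benchmark's evaluation code: residues are compared position by position from the N-terminus, isoleucine and leucine are treated as identical because they are isobaric, and modified residues and mass-tolerance matching are ignored. The function names are hypothetical.

```python
def count_matches(pred, truth):
    """Count aligned positions with the same residue (I is treated as L)."""
    pred = pred.replace("I", "L")
    truth = truth.replace("I", "L")
    return sum(a == b for a, b in zip(pred, truth))


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def sequencing_metrics(pairs):
    """Compute precision and recall at amino acid and peptide level.

    `pairs` is a list of (predicted, truth) strings; `predicted` is None
    when the model produced no sequence for that spectrum.
    """
    aa_matched = aa_predicted = aa_truth = 0
    pep_correct = pep_predicted = 0
    for pred, truth in pairs:
        aa_truth += len(truth)
        if pred is None:
            continue  # counts against recall, not against precision
        pep_predicted += 1
        aa_predicted += len(pred)
        matched = count_matches(pred, truth)
        aa_matched += matched
        # A peptide is correct only if every residue matches.
        if matched == len(truth) == len(pred):
            pep_correct += 1
    return {
        "aa_precision": _ratio(aa_matched, aa_predicted),
        "aa_recall": _ratio(aa_matched, aa_truth),
        "peptide_precision": _ratio(pep_correct, pep_predicted),
        "peptide_recall": _ratio(pep_correct, len(pairs)),
    }


# Made-up example data, not taken from the benchmark.
example = [
    ("PEPTIDEK", "PEPTIDEK"),
    ("PEPTLDEK", "PEPTIDEK"),  # I/L swap still counts as correct
    ("AEPTIDER", "PEPTIDEK"),  # two wrong residues
    (None, "SAMPLER"),         # no prediction for this spectrum
]
metrics = sequencing_metrics(example)
```

In this toy example two of the three predicted peptides are fully correct, so peptide precision is 2/3, while peptide recall is 2/4 because one spectrum received no prediction.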
Assembly Performance
- Template-guided Fusion assembler achieved error-free reconstruction of all chains and complementarity-determining regions (CDRs)
- Superior coverage, accuracy, and gap minimization when using high-quality peptide reads from six algorithms
- Comprehensive evaluation across coverage depth and assembly score metrics
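Coverage and coverage depth can be illustrated with a small sketch that maps peptide reads onto a known chain sequence. This assumes exact substring matching, which is a simplification: it does not reproduce the benchmark's assembly score or the Fusion assembler, and the reference fragment and peptides below are invented for illustration. The function names are hypothetical.

```python
def coverage_depth(reference, peptides):
    """Return, for each reference residue, how many peptide reads cover it."""
    depth = [0] * len(reference)
    for peptide in peptides:
        if not peptide:
            continue  # skip empty reads
        start = reference.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                depth[i] += 1
            # Keep searching so repeated occurrences are also counted.
            start = reference.find(peptide, start + 1)
    return depth


def coverage(reference, peptides):
    """Fraction of reference residues covered by at least one read."""
    if not reference:
        return 0.0
    depth = coverage_depth(reference, peptides)
    return sum(d > 0 for d in depth) / len(reference)


# Invented 16-residue fragment and reads, for illustration only.
reference = "EVQLVESGGGLVQPGG"
reads = ["EVQLVESG", "VESGGGLV", "XXXX"]

depth = coverage_depth(reference, reads)
fraction = coverage(reference, reads)
```

Here the first two reads overlap on four residues, the third read matches nowhere, and the last four residues of the fragment remain uncovered, giving a coverage of 0.75 and a maximum depth of 2.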
Research Applications
AbNovoBench is specifically designed for monoclonal antibody research and applications:
- Antibody Discovery: De novo sequencing of monoclonal antibodies from mass spectrometry data
- Therapeutic Development: Characterization of antibody sequences for drug development
- Clinical Diagnostics: Antibody sequencing for diagnostic applications
- Proteomics Research: Standardized benchmarking for antibody-specific algorithm development
Citation
If you use AbNovoBench in your research, please cite our paper:
@misc{jiang2025abnovobench,
title = {AbNovoBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Monoclonal Antibody De Novo Sequencing Analysis},
author = {Wenbin Jiang and Ling Luo and Lihong Huang and Jin Xiao and Zihan Lin and Yijie Qiu and Jiying Wang and Ouyang Hu and Sainan Zhang and Mengsha Tong and Ningshao Xia and Yueting Xiong and Quan Yuan and Rongshan Yu},
year = {2025},
howpublished = {https://github.com/dumbgoos/AbNovoBench}
}
Contributing
We welcome contributions to improve the models or add new ones. Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Acknowledgments
We thank the original authors of each model for their contributions to the field of de novo peptide sequencing. This collection represents the collaborative effort of the proteomics community. AbNovoBench is available at https://abnovobench.com and provides a scalable, community-driven platform enriched with an extensive antibody MS data resource to accelerate antibody-specific algorithm development and enhance proteomic reproducibility.
Contact
For questions or support, please open an issue on this repository or contact the maintainers.
Note: These models are provided for research purposes. Please ensure you have the appropriate licenses and permissions for your specific use case.