# CodeModernBERT-Finch
CodeModernBERT-Finch is a code-specific encoder pretrained solely on the CodeSearchNet dataset. It supports the six languages included in CodeSearchNet: Go, Java, JavaScript, PHP, Python, and Ruby.
For a version fine-tuned specifically for code search tasks, see Shuu12121/CodeSearch-ModernBERT-Finch.
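Since the model is pretrained with masked language modeling (see Architecture below), it can be used directly for masked-token prediction on code. A minimal sketch, assuming the checkpoint is published on the Hub under the id `Shuu12121/CodeModernBERT-Finch`:

```python
# A minimal usage sketch. The Hub id "Shuu12121/CodeModernBERT-Finch" is
# assumed from the model name; adjust it if the checkpoint lives elsewhere.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="Shuu12121/CodeModernBERT-Finch")

# The model is pretrained with masked language modeling, so it can
# predict masked tokens in code snippets.
print(fill_mask("def add(a, b):\n    return a [MASK] b"))
```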
## Architecture
- Base: ModernBERT-style encoder
- Hidden size: 512
- Layers: 6
- Attention heads: 6
- Parameters: ~50M
- Pretraining: Masked Language Modeling (MLM)
- Fine-tuning: Domain-specific code tasks
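The listed sizes can be sanity-checked directly from the checkpoint. A short sketch, again assuming the Hub id used above:

```python
# Load the pretrained encoder and print the architecture numbers
# listed above. The Hub id is assumed, as in the previous snippet.
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("Shuu12121/CodeModernBERT-Finch")
cfg = model.config
print(cfg.hidden_size, cfg.num_hidden_layers, cfg.num_attention_heads)

# Total parameter count in millions.
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```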
The results below were obtained by randomly sampling 10,000 examples per language from the CodeSearchNet dataset, fine-tuning each model in a Sentence-BERT (bi-encoder) fashion, and evaluating on the MTEB CodeSearchNetRetrieval benchmark.
All models in the table were fine-tuned with the same procedure, using Multiple Negatives Ranking Loss. The Finch models and the models marked "-200" were trained with a batch size of 200; the others were trained with a batch size of 40, because larger batches did not fit into memory.
The exception is Finch-SmallBatch, which was trained at batch size 40 to serve as a comparison against the standard Finch models trained at 200. A minimal sketch of this training setup follows.
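This is a rough illustration of the setup, not the exact training script; the (docstring, code) pairs, the Hub id, and the tiny batch size are placeholders:

```python
# A minimal sketch of the fine-tuning setup described above, using
# sentence-transformers with MultipleNegativesRankingLoss. The training
# pairs are toy placeholders; the real data is 10,000 (docstring, code)
# pairs per language sampled from CodeSearchNet.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("Shuu12121/CodeModernBERT-Finch")  # Hub id assumed

train_examples = [
    InputExample(texts=["Add two numbers.", "def add(a, b):\n    return a + b"]),
    InputExample(texts=["Reverse a string.", "def rev(s):\n    return s[::-1]"]),
]

# The standard Finch runs used batch size 200; the small-batch and
# non-200 baselines used 40. A toy value is used here for the toy data.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```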
| Model | go | java | javascript | php | python | ruby |
|---|---|---|---|---|---|---|
| Finch (40M) | 0.934 | 0.784 | 0.728 | 0.835 | 0.865 | 0.756 |
| Finch-Pre (40M) | 0.937 | 0.705 | 0.685 | 0.828 | 0.843 | 0.725 |
| Finch-SmallBatch (40M) | 0.930 | 0.765 | 0.707 | 0.825 | 0.859 | 0.748 |
| ModernBERT-base-Finetuned (149M) | 0.933 | 0.779 | 0.748 | 0.839 | 0.885 | 0.794 |
| Owl-4.1-Small-Fine-tuned (151M) | 0.942 | 0.780 | 0.729 | 0.843 | 0.893 | 0.772 |
| Owl-4.1-Small-Fine-tuned-200 (151M) | 0.943 | 0.850 | 0.747 | 0.858 | 0.894 | 0.802 |
| CodeBERT-Fine-tuned (125M) | 0.932 | 0.708 | 0.709 | 0.828 | 0.870 | 0.772 |
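The scores can be reproduced with the MTEB harness. A hedged sketch, assuming the fine-tuned checkpoint id referenced at the top of this card:

```python
# A sketch of evaluating a fine-tuned checkpoint on the MTEB
# CodeSearchNetRetrieval task. The model id is the fine-tuned variant
# referenced at the top of this card.
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Shuu12121/CodeSearch-ModernBERT-Finch")
tasks = mteb.get_tasks(tasks=["CodeSearchNetRetrieval"])
evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model, output_folder="results")
```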