CodeModernBERT-Finch

This is a code-specific pretrained model trained solely on the CodeSearchNet dataset. It supports the six programming languages included in CodeSearchNet (Go, Java, JavaScript, PHP, Python, and Ruby).
For a version fine-tuned specifically for code search tasks, please refer to Shuu12121/CodeSearch-ModernBERT-Finch.
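A minimal usage sketch with the 🤗 Transformers fill-mask pipeline, reflecting the MLM pretraining objective listed under Architecture below; the code snippet being completed is a hypothetical example, and the mask token is read from the tokenizer rather than hard-coded:

```python
from transformers import pipeline

# Load the pretrained checkpoint as a masked-language model.
fill_mask = pipeline("fill-mask", model="Shuu12121/CodeModernBERT-Finch")

# Mask a single token in a small, hypothetical Python snippet.
mask = fill_mask.tokenizer.mask_token
code = f"def add(a, b):\n    return a {mask} b"

for pred in fill_mask(code, top_k=5):
    print(f"{pred['token_str']!r}  {pred['score']:.3f}")
```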

Architecture

  • Base: ModernBERT-style encoder
  • Hidden size: 512
  • Layers: 6
  • Attention heads: 6
  • Parameters: ~40M (40.8M in the released checkpoint)
  • Pretraining: Masked Language Modeling (MLM)
  • Fine-tuning: Domain-specific code tasks

The results below were obtained by randomly sampling 10,000 examples per language from the CodeSearchNet dataset, fine-tuning each model on them in a Sentence-BERT fashion, and evaluating on the MTEB CodeSearchNetRetrieval benchmark (an evaluation sketch follows the results table). All models in the table were fine-tuned with the same recipe using Multiple Negatives Ranking Loss: the standard Finch models and those marked with 200 used a batch size of 200, while the remaining models used a batch size of 40 because larger batches did not fit into memory. Finch-SmallBatch was additionally trained with a batch size of 40 to serve as a comparison against the standard Finch models trained with batch size 200.
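The snippet below is a rough sketch of that fine-tuning recipe using the sentence-transformers fit API with MultipleNegativesRankingLoss and a batch size of 200; the pooling choice, pair construction, epoch count, and warmup steps are illustrative assumptions, not the exact training script.

```python
from sentence_transformers import SentenceTransformer, InputExample, losses, models
from torch.utils.data import DataLoader

# Wrap the pretrained encoder in a Sentence-BERT style bi-encoder.
# Mean pooling is an assumption; the original pooling setup is not stated.
word_emb = models.Transformer("Shuu12121/CodeModernBERT-Finch")
pooling = models.Pooling(word_emb.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word_emb, pooling])

# (docstring, code) pairs sampled from CodeSearchNet; placeholders shown here.
train_examples = [
    InputExample(texts=["Return the sum of two numbers.",
                        "def add(a, b):\n    return a + b"]),
    # ... roughly 10,000 sampled pairs per language ...
]

# Multiple Negatives Ranking Loss with the batch size of 200 described above.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=200)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,          # assumed; not stated in the card
    warmup_steps=100,  # assumed
)
```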

Model                                go     java   javascript  php    python  ruby
Finch(40M)                           0.934  0.784  0.728       0.835  0.865   0.756
Finch-Pre(40M)                       0.937  0.705  0.685       0.828  0.843   0.725
Finch-SmallBatch(40M)                0.930  0.765  0.707       0.825  0.859   0.748
ModernBERT-base-Finetuned(149M)      0.933  0.779  0.748       0.839  0.885   0.794
Owl-4.1-Small-Fine-tuned(151M)       0.942  0.780  0.729       0.843  0.893   0.772
Owl-4.1-Small-Fine-tuned-200(151M)   0.943  0.850  0.747       0.858  0.894   0.802
CodeBERT-Fine-tuned(125M)            0.932  0.708  0.709       0.828  0.870   0.772
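The CodeSearchNetRetrieval numbers above can be reproduced with the mteb package along the lines of the sketch below; the checkpoint name and output folder are illustrative.

```python
import mteb
from sentence_transformers import SentenceTransformer

# Any of the fine-tuned bi-encoders from the table can be substituted here;
# the code-search fine-tune of Finch is used as an example.
model = SentenceTransformer("Shuu12121/CodeSearch-ModernBERT-Finch")

# Run the MTEB CodeSearchNetRetrieval task.
tasks = mteb.get_tasks(tasks=["CodeSearchNetRetrieval"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/finch")
# Per-language scores are also written as JSON under the output folder.
```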
