CodeModernBERT-Finch

This is a code-specific pretrained model trained solely on the CodeSearchNet dataset. It supports the six programming languages included in CodeSearchNet (Go, Java, JavaScript, PHP, Python, and Ruby).
For a version fine-tuned specifically for code search tasks, please refer to Shuu12121/CodeSearch-ModernBERT-Finch.
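A minimal usage sketch with the 🤗 Transformers fill-mask pipeline, reflecting the MLM pretraining objective listed under Architecture below; the code snippet being completed is a hypothetical example, and the mask token is read from the tokenizer rather than hard-coded:

```python
from transformers import pipeline

# Load the pretrained checkpoint as a masked-language model.
fill_mask = pipeline("fill-mask", model="Shuu12121/CodeModernBERT-Finch")

# Mask a single token in a small, hypothetical Python snippet.
mask = fill_mask.tokenizer.mask_token
code = f"def add(a, b):\n    return a {mask} b"

for pred in fill_mask(code, top_k=5):
    print(f"{pred['token_str']!r}  {pred['score']:.3f}")
```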

Architecture

  • Base: ModernBERT-style encoder
  • Hidden size: 512
  • Layers: 6
  • Attention heads: 6
  • Parameters: ~40M (40.8M in the released checkpoint)
  • Pretraining: Masked Language Modeling (MLM)
  • Fine-tuning: Domain-specific code tasks

The results below were obtained by randomly sampling 10,000 examples per language from the CodeSearchNet dataset, fine-tuning each model on them in a Sentence-BERT fashion, and evaluating on the MTEB CodeSearchNetRetrieval benchmark (an evaluation sketch follows the results table). All models in the table were fine-tuned with the same recipe using Multiple Negatives Ranking Loss: the standard Finch models and those marked with 200 used a batch size of 200, while the remaining models used a batch size of 40 because larger batches did not fit into memory. Finch-SmallBatch was additionally trained with a batch size of 40 to serve as a comparison against the standard Finch models trained with batch size 200.
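The snippet below is a rough sketch of that fine-tuning recipe using the sentence-transformers fit API with MultipleNegativesRankingLoss and a batch size of 200; the pooling choice, pair construction, epoch count, and warmup steps are illustrative assumptions, not the exact training script.

```python
from sentence_transformers import SentenceTransformer, InputExample, losses, models
from torch.utils.data import DataLoader

# Wrap the pretrained encoder in a Sentence-BERT style bi-encoder.
# Mean pooling is an assumption; the original pooling setup is not stated.
word_emb = models.Transformer("Shuu12121/CodeModernBERT-Finch")
pooling = models.Pooling(word_emb.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word_emb, pooling])

# (docstring, code) pairs sampled from CodeSearchNet; placeholders shown here.
train_examples = [
    InputExample(texts=["Return the sum of two numbers.",
                        "def add(a, b):\n    return a + b"]),
    # ... roughly 10,000 sampled pairs per language ...
]

# Multiple Negatives Ranking Loss with the batch size of 200 described above.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=200)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,          # assumed; not stated in the card
    warmup_steps=100,  # assumed
)
```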

Model                                go     java   javascript  php    python  ruby
Finch(40M)                           0.934  0.784  0.728       0.835  0.865   0.756
Finch-Pre(40M)                       0.937  0.705  0.685       0.828  0.843   0.725
Finch-SmallBatch(40M)                0.930  0.765  0.707       0.825  0.859   0.748
ModernBERT-base-Finetuned(149M)      0.933  0.779  0.748       0.839  0.885   0.794
Owl-4.1-Small-Fine-tuned(151M)       0.942  0.780  0.729       0.843  0.893   0.772
Owl-4.1-Small-Fine-tuned-200(151M)   0.943  0.850  0.747       0.858  0.894   0.802
CodeBERT-Fine-tuned(125M)            0.932  0.708  0.709       0.828  0.870   0.772
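The CodeSearchNetRetrieval numbers above can be reproduced with the mteb package along the lines of the sketch below; the checkpoint name and output folder are illustrative.

```python
import mteb
from sentence_transformers import SentenceTransformer

# Any of the fine-tuned bi-encoders from the table can be substituted here;
# the code-search fine-tune of Finch is used as an example.
model = SentenceTransformer("Shuu12121/CodeSearch-ModernBERT-Finch")

# Run the MTEB CodeSearchNetRetrieval task.
tasks = mteb.get_tasks(tasks=["CodeSearchNetRetrieval"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/finch")
# Per-language scores are also written as JSON under the output folder.
```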
