---
tags:
- sentence-transformers
- sentence-similarity
- code
- python
- php
- javascript
- ruby
- rust
- go
- java
base_model: Shuu12121/CodeModernBERT-Owl-1.0
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: apache-2.0
language:
- en
datasets:
- Shuu12121/python-codesearch-filtered
- Shuu12121/java-codesearch-filtered
- Shuu12121/javascript-codesearch-filtered
- Shuu12121/go-codesearch-filtered
- Shuu12121/php-codesearch-filtered
- Shuu12121/ruby-codesearch-filtered
- Shuu12121/rust-codesearch-filtered
- code-search-net/code_search_net
---

# **🦉 CodeSearch-ModernBERT-Owl-Plus: High-Performance Sentence-BERT for Code Search**

**CodeSearch-ModernBERT-Owl-Plus** is a code search model fine-tuned as a Sentence-BERT model from the pretrained **CodeModernBERT-Owl v1.0**. It is optimized for retrieving functions from codebases given natural language queries, and it achieves strong results on the MTEB code retrieval tasks reported below (NDCG@10 of 0.8918 on CodeSearchNetRetrieval).

---

# **🛠 Features**

* ✅ Fine-tuned in Sentence-BERT format from CodeModernBERT-Owl
* ✅ Supports multiple programming languages (Python, Java, JavaScript, Go, PHP, Ruby, Rust)
* ✅ Specialized encoder for high-accuracy code search
* ✅ Ideal for multi-stage (dual encoder) retrieval setups
* ✅ Generates rich semantic embeddings for code and queries

---

# **📊 Evaluation on MTEB Benchmark**

## **🏆 Main Scores in MTEB**

This model achieved the following **main scores** (NDCG@10):

* **CodeSearchNetRetrieval**: `main_score = 0.8918`
* **COIR-CodeSearchNetRetrieval**: `main_score = 0.8013`

---

### 🧪 **CodeSearchNetRetrieval (MTEB)**

| Metric        | Score      |
| ------------- | ---------- |
| **MRR@10**    | **0.8704** |
| **NDCG@10**   | 0.8918     |
| MAP@10        | 0.8704     |
| Recall@10     | 0.9563     |
| Precision@10  | 0.0956     |

The model scores well on every ranking metric in the table: the relevant function appears in the top 10 for about 96% of queries (Recall@10) and is usually ranked at or near the top (MRR@10).
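
To make the relationship between these metrics concrete, here is a minimal sketch of how they can be computed. It assumes that each query has exactly one relevant code snippet, which is consistent with MAP@10 equalling MRR@10 and Precision@10 being one tenth of Recall@10 in the table. The function name `metrics_at_k` and the example ranks are hypothetical; this is not the MTEB implementation.

```python
import math


def metrics_at_k(ranks, k=10):
    """Average retrieval metrics, assuming one relevant item per query.

    ranks: for each query, the 1-based rank of its relevant item,
           or None if the item was not retrieved at all.
    """
    n = len(ranks)
    # Keep only queries whose relevant item appears in the top k.
    hits = [r for r in ranks if r is not None and r <= k]

    # Reciprocal rank; with a single relevant item, average precision
    # reduces to the same value, so MAP@k equals MRR@k.
    mrr = sum(1.0 / r for r in hits) / n
    # The ideal DCG is 1 when there is one relevant item, so NDCG is just
    # the discounted gain at the rank where the item was found.
    ndcg = sum(1.0 / math.log2(r + 1) for r in hits) / n
    recall = len(hits) / n
    # At most one of the k retrieved items can be relevant.
    precision = recall / k

    return {"mrr": mrr, "map": mrr, "ndcg": ndcg,
            "recall": recall, "precision": precision}


# Hypothetical ranks for four queries (illustrative only, not model output).
example = metrics_at_k([1, 2, None, 1])
print(example)
```

Under this assumption, Precision@10 can never exceed 0.1, so a value of 0.0956 is close to the ceiling and not a sign of poor precision.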

---

### 🧪 **COIR-CodeSearchNetRetrieval (MTEB)**

| Metric        | Score      |
| ------------- | ---------- |
| **MRR@10**    | **0.7751** |
| **NDCG@10**   | 0.8013     |
| MAP@10        | 0.7751     |
| Recall@10     | 0.8826     |
| Precision@10  | 0.0883     |

Performance remains robust on the COIR variant of the task, which suggests the model generalizes beyond a single benchmark.

---

# **📥 Usage Example**

```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub.
model = SentenceTransformer("Shuu12121/CodeSearch-ModernBERT-Owl-Plus")

# Encode a natural language query and a code snippet into embeddings.
embeddings = model.encode([
    "binary search function",
    "def binary_search(arr, target): ...",
])
```

---

# **📝 Conclusion**

* ✅ An optimized Sentence-BERT model based on CodeModernBERT-Owl
* ✅ Achieves MRR@10 > 0.87 on MTEB CodeSearchNetRetrieval
* ✅ Ready for integration in production-level code search systems

---

# **📜 License**

📄 Apache-2.0

# **📧 Contact**

For questions or inquiries, feel free to reach out:

📧 **[shun0212114@outlook.jp](mailto:shun0212114@outlook.jp)**