🦉 CodeModernBERT-Owl-4.1

CodeModernBERT-Owl-4.1 is a pre-trained version of the multilingual long-context encoder model in the CodeModernBERT series. It is optimized for downstream code-related tasks such as code search, code summarization, bug repair, and representation learning.

This model is built upon the pretraining checkpoint CodeModernBERT-Owl-4.1-Pre and was further pre-trained to better capture structural patterns and semantics in source code across multiple programming languages.


🚀 Model Highlights

  • 2048-token context window for long code understanding (see the sketch after this list)
  • Trained on 9.9M functions in 8 programming languages
  • Fine-tuned for downstream usability
  • Ideal for code search, semantic embedding, summarization, and cloze-style bug repair
  • Multilingual support: Python, JavaScript, Java, TypeScript, PHP, Go, Ruby, and Rust
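
Because the encoder accepts up to 2048 tokens, whole files or multi-function snippets can often be encoded in a single pass rather than function by function. A minimal sketch (the file path my_module.py is purely illustrative):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")

# Read an entire source file (hypothetical path) instead of a single function
with open("my_module.py") as f:
    long_code = f.read()

inputs = tokenizer(
    long_code,
    return_tensors="pt",
    truncation=True,
    max_length=2048,  # the model's full context window
)
print(inputs["input_ids"].shape)  # (1, <=2048)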

Architecture

  • Base: ModernBERT-style encoder
  • Hidden size: 768
  • Layers: 12
  • Attention heads: 12
  • Parameters: ~152M
  • Pretraining: Masked Language Modeling (MLM), illustrated in the fill-mask sketch below
  • Fine-tuning: Domain-specific code tasks
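
Since pretraining uses MLM, cloze-style bug repair can be sketched as a standard fill-mask pass: mask the suspect token and rank the model's candidate replacements. This assumes the checkpoint includes MLM weights loadable via AutoModelForMaskedLM:

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")
model = AutoModelForMaskedLM.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")
model.eval()

# Mask the suspicious token and let the model propose replacements
buggy_code = f"def add(a, b):\n    return a {tokenizer.mask_token} b"
inputs = tokenizer(buggy_code, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and take the top-5 candidate tokens
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5, dim=-1).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))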

🧪 Usage (Hugging Face Transformers)

from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")
model = AutoModel.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")
model.eval()

code = "def factorial(n):\n    if n <= 1:\n        return 1\n    return n * factorial(n - 1)"
inputs = tokenizer(code, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# Mean pooling over token embeddings, weighted by the attention mask
# so that padding tokens do not contribute to the final embedding
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output.last_hidden_state
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

embeddings = mean_pooling(outputs, inputs["attention_mask"])
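
The pooled embeddings can be used directly for code search by comparing a natural-language query against code snippets with cosine similarity. A short follow-on sketch reusing tokenizer, model, mean_pooling, and embeddings from the snippet above (the query string is illustrative):

import torch.nn.functional as F

query = "compute the factorial of a number recursively"
query_inputs = tokenizer(query, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    query_outputs = model(**query_inputs)

query_emb = mean_pooling(query_outputs, query_inputs["attention_mask"])

# Cosine similarity between the code embedding and the query embedding
similarity = F.cosine_similarity(embeddings, query_emb)
print(similarity.item())  # higher values indicate a better match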
