# 🦉 CodeModernBERT-Owl-4.1
CodeModernBERT-Owl-4.1 is a pre-trained multilingual long-context encoder model in the CodeModernBERT series. It is optimized for downstream code-related tasks such as code search, code summarization, bug repair, and representation learning.
This model builds on the pretraining checkpoint CodeModernBERT-Owl-4.1-Pre and was further pre-trained to better capture structural patterns and semantics in source code across multiple programming languages.
## 🚀 Model Highlights
- 2048-token context window for long code understanding
- Trained on 9.9M functions in 8 programming languages
- Fine-tuned for downstream usability
- Ideal for code search, semantic embedding, summarization, and cloze-style bug repair (a masked-prediction sketch follows this list)
- Multilingual support: Python, JavaScript, Java, TypeScript, PHP, Go, Ruby, and Rust
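Because the model was pre-trained with masked language modeling, the MLM head can be used directly for the cloze-style bug repair mentioned above: mask a suspicious token and let the model rank replacements. A minimal sketch, assuming the published checkpoint includes the MLM head weights and that the tokenizer defines a mask token (both are standard for ModernBERT-style models); the buggy snippet is illustrative only:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")
model = AutoModelForMaskedLM.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")

# Replace the suspicious token in a buggy line with the mask token
code = f"def is_even(n):\n    return n % 2 == {tokenizer.mask_token}"
inputs = tokenizer(code, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Rank the model's top suggestions for the masked position
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_positions].topk(5).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```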
## Architecture
- Base: ModernBERT-style encoder
- Hidden size: 768
- Layers: 12
- Attention heads: 12
- Parameters: ~150M
- Pretraining: Masked Language Modeling (MLM)
- Fine-tuning: Domain-specific code tasks
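If needed, the listed values can be checked against the published configuration. A small sketch using standard Hugging Face config attributes (field names assumed to follow the usual ModernBERT config):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")

# These should line up with the architecture list above
print(config.hidden_size)              # 768
print(config.num_hidden_layers)        # 12
print(config.num_attention_heads)      # 12
print(config.max_position_embeddings)  # 2048-token context window
```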
## 🧪 Usage (Hugging Face Transformers)
```python
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")
model = AutoModel.from_pretrained("Shuu12121/CodeModernBERT-Owl-4.1")

code = "def factorial(n):\n    if n <= 1:\n        return 1\n    return n * factorial(n - 1)"
inputs = tokenizer(code, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# Mean pooling: average the token embeddings, ignoring padding positions
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output.last_hidden_state
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

embeddings = mean_pooling(outputs, inputs["attention_mask"])
```
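The pooled embeddings can then be compared with cosine similarity for retrieval-style code search. A minimal sketch that reuses the `tokenizer`, `model`, and `mean_pooling` defined above; the query and candidate snippets are illustrative only:

```python
import torch.nn.functional as F

def embed(texts):
    # Batch-encode the inputs and mean-pool the token embeddings
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        out = model(**enc)
    return mean_pooling(out, enc["attention_mask"])

query = "compute the factorial of a number recursively"
candidates = [
    "def factorial(n):\n    if n <= 1:\n        return 1\n    return n * factorial(n - 1)",
    "def add(a, b):\n    return a + b",
]

# Cosine similarity between the query and each candidate function
scores = F.cosine_similarity(embed([query]), embed(candidates))
print(scores)  # the factorial snippet should score higher for this query
```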