---
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- de
tags:
- fill-mask
- masked-lm
- long-context
- modernbert
library_name: transformers
license: other
---

# ModernGBERT 134M

This is a German ModernBERT language model with 134M parameters, trained from scratch with the ModernBERT [codebase](https://github.com/AnswerDotAI/ModernBERT) on the same German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2) used for our [LLäMmlein](https://huggingface.co/collections/LSX-UniWue/llammlein-6732ff41f3705c686e605762) family. Find more details in our [preprint](https://arxiv.org/abs/2505.13136)!

### Usage

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Load the checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForMaskedLM.from_pretrained("LSX-UniWue/ModernGBERT_134M")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/ModernGBERT_134M")
```

### Performance

We evaluated our model on the [SuperGLEBer](https://lsx-uniwue.github.io/SuperGLEBer-site/) benchmark.
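
For a quick qualitative check beyond benchmark scores, the sketch below runs a single fill-mask prediction with the loaded model. The German prompt and the top-1 decoding step are illustrative assumptions, not examples from the original card; it assumes the tokenizer exposes a mask token, as is usual for ModernBERT-style checkpoints.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/ModernGBERT_134M")
model = AutoModelForMaskedLM.from_pretrained("LSX-UniWue/ModernGBERT_134M")

# Use the tokenizer's own mask token so the snippet does not hard-code its string form.
text = f"Die Hauptstadt von Bayern ist {tokenizer.mask_token}."  # illustrative prompt
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and decode the highest-scoring token there.
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```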