---
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- de
tags:
- fill-mask
- masked-lm
- long-context
- modernbert
library_name: transformers
license: other
---

# ModernGBERT 134M

This is a German ModernBERT language model with 134M parameters, trained from scratch with the ModernBERT [codebase](https://github.com/AnswerDotAI/ModernBERT) on the same German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2) used for our [LLäMmlein](https://huggingface.co/collections/LSX-UniWue/llammlein-6732ff41f3705c686e605762) family. Find more details in our [preprint](https://arxiv.org/abs/2505.13136)!

### Usage

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Load the checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForMaskedLM.from_pretrained("LSX-UniWue/ModernGBERT_134M")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/ModernGBERT_134M")
```

### Performance

We evaluated our model on the [SuperGLEBer](https://lsx-uniwue.github.io/SuperGLEBer-site/) benchmark.
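
For a quick qualitative check beyond benchmark scores, the sketch below runs a single fill-mask prediction with the loaded model. The German prompt and the top-1 decoding step are illustrative assumptions, not examples from the original card; it assumes the tokenizer exposes a mask token, as is usual for ModernBERT-style checkpoints.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/ModernGBERT_134M")
model = AutoModelForMaskedLM.from_pretrained("LSX-UniWue/ModernGBERT_134M")

# Use the tokenizer's own mask token so the snippet does not hard-code its string form.
text = f"Die Hauptstadt von Bayern ist {tokenizer.mask_token}."  # illustrative prompt
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and decode the highest-scoring token there.
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```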