GeistBERT-Longformer

GeistBERT-Longformer is a German language model designed for long-context NLP tasks. It extends GeistBERT with the Longformer self-attention mechanism, enabling the processing of significantly longer sequences while remaining efficient.
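
The snippet below is a minimal loading sketch, not part of the original card: it assumes the Hugging Face transformers library, the model ID GeistBERT/GeistBERT_base_longformer shown on this page, and a RoBERTa-style `<mask>` token.

```python
# Hedged sketch: load the checkpoint and run a masked-language-model forward pass.
# Model ID taken from this page; the <mask> token is assumed (RoBERTa convention).
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "GeistBERT/GeistBERT_base_longformer"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

inputs = tokenizer("Berlin ist die <mask> von Deutschland.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```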

This variant is particularly well-suited for:

  • Document-level tasks such as legal text analysis, summarization, and passage retrieval.
  • Tasks requiring extended context windows beyond traditional transformer limits.

Key Features:

  • Sliding-window attention: Efficient self-attention mechanism that scales to longer sequences.
  • Extended context length: Allows processing of larger text spans compared to standard BERT/RoBERTa (see the sketch after this list).
  • Optimized for German: Pre-trained on a largely deduplicated German corpus (OSCAR23, OPUS, MC4).
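
The sketch below illustrates how the sliding-window attention and extended context length are typically exercised. It is an assumption based on the standard Longformer API in transformers (global_attention_mask, 4096-token window), not something stated on this card.

```python
# Hedged sketch: encode a long German document. The 4096-token limit and the
# global_attention_mask argument follow the usual Longformer interface and are
# assumptions for this checkpoint.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "GeistBERT/GeistBERT_base_longformer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

long_text = " ".join(["Dies ist ein sehr langes deutsches Dokument."] * 400)
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=4096)

# Only the first token (<s>) attends globally; all other tokens use the
# efficient sliding-window attention.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```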

Compared to Nyströmformer and standard RoBERTa, GeistBERT-Longformer requires significantly more VRAM, often necessitating multi-GPU training with gradient accumulation for large batch sizes.
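
The following is an illustrative sketch of the kind of fine-tuning configuration the note above refers to: a small per-device batch combined with gradient accumulation to reach a large effective batch size. All hyperparameter values are examples, not recommendations from the model card.

```python
# Hedged sketch: gradient accumulation with transformers.TrainingArguments.
# Values are illustrative only.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="geistbert-longformer-finetune",
    per_device_train_batch_size=2,    # long sequences are VRAM-heavy
    gradient_accumulation_steps=16,   # effective batch = 2 * 16 * num_gpus
    fp16=True,                        # mixed precision to reduce memory use
    num_train_epochs=3,
    learning_rate=3e-5,
)
```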

For more details, see GeistBERT on Hugging Face.
