GeistBERT-Longformer
GeistBERT-Longformer is a German language model designed for long-context NLP tasks. It extends GeistBERT with the Longformer self-attention mechanism, allowing it to process significantly longer sequences while remaining efficient.
This variant is particularly well-suited for:
- Document-level tasks such as legal text analysis, summarization, and passage retrieval.
- Tasks requiring extended context windows beyond traditional transformer limits.
Key Features:
- Sliding-window attention: Efficient self-attention mechanism that scales to longer sequences.
- Extended context length: Allows processing of larger text spans than standard BERT/RoBERTa (see the usage sketch below).
- Optimized for German: Pre-trained on a largely deduplicated German corpus (OSCAR23, OPUS, MC4).
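The snippet below is a minimal usage sketch, assuming the checkpoint is published as GeistBERT/GeistBERT_base_longformer (the ID from the model lineage below) and loads through the standard Transformers auto classes; the context length is read from the model config rather than hard-coded.

```python
# Minimal usage sketch (assumption: the checkpoint loads via the standard
# Longformer-compatible auto classes; adjust if the repository specifies
# a different head or tokenizer class).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "GeistBERT/GeistBERT_base_longformer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Placeholder long German document; real inputs would be full legal texts,
# reports, or other document-level material.
long_text = "Das Gericht stellte fest, dass die Klage zulässig ist. " * 200

# Tokenize up to the model's extended context window instead of the
# usual 512-token BERT/RoBERTa limit.
inputs = tokenizer(
    long_text,
    truncation=True,
    max_length=model.config.max_position_embeddings - 2,
    return_tensors="pt",
)

with torch.no_grad():
    outputs = model(**inputs)

print(outputs.logits.shape)  # (1, sequence_length, vocab_size)
```

For task-specific heads, Longformer models typically also accept a global_attention_mask to mark tokens (e.g., the CLS token) that should attend globally; the masked-LM forward pass above uses local sliding-window attention only.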
Compared to Nyströmformer and standard RoBERTa, GeistBERT-Longformer requires significantly more VRAM, often necessitating multi-GPU training with gradient accumulation for large batch sizes.
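As a hedged illustration of such a setup, here is a minimal sketch of masked-language-model training with gradient accumulation using the Hugging Face Trainer; the toy in-memory dataset and all hyperparameters are placeholders, not the values used to train the released checkpoint.

```python
# Training sketch (assumptions: illustrative hyperparameters and a toy
# dataset; multi-GPU launches would typically go through
# `torchrun`/`accelerate`, which Trainer picks up automatically).
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "GeistBERT/GeistBERT_base_longformer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Toy corpus standing in for pre-tokenized long German documents.
texts = ["Dies ist ein Platzhaltertext für das Training langer Dokumente."] * 8
train_dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="geistbert-longformer-mlm",
    per_device_train_batch_size=1,    # long sequences are VRAM-hungry
    gradient_accumulation_steps=32,   # emulate a large effective batch size
    fp16=True,                        # mixed precision (requires a GPU)
    num_train_epochs=1,
    report_to="none",                 # keep the toy run free of logging integrations
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15),
)
trainer.train()
```

Small per-device batches combined with many accumulation steps keep peak VRAM low while preserving a large effective batch size, which is the trade-off the paragraph above describes.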
For more details, see GeistBERT on Hugging Face.
Model lineage: GeistBERT/GeistBERT_base_longformer builds on GeistBERT/GeistBERT_base, which in turn derives from the base model TUM/GottBERT_filtered_base_best.