Occiglot5

Occiglot5 is a modern T5 model for German with 1.42B parameters and the following features:

  • Pretrained on the German Occiglot FineWeb corpus (except deWaC and Open Legal Data) and on the 10BT subsets of FineWeb and FineWeb-Edu
  • UL2 is used as the pretraining objective
  • The efficient T5 architecture from the "Scale Efficiently" paper is used
  • Pretrained for 5M steps using a batch size of 128 and an input/output sequence length of 512
  • One-shot training on a v4-32 TPU Pod for 22.3 days without any crashes
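
To put the training figures above in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes that every input sequence is fully packed to 512 tokens and counts the input side only, so the token count is an upper bound rather than a measured value. The variable names are illustrative and do not come from the training code.

```python
# Rough training budget derived from the figures in the list above.
# Assumption: every input sequence is filled to the full 512 tokens,
# so the token count is an upper bound, not a measured number.
STEPS = 5_000_000        # pretraining steps
BATCH_SIZE = 128         # sequences per step
SEQ_LEN = 512            # input sequence length in tokens
TRAINING_DAYS = 22.3     # wall-clock time on the v4-32 TPU Pod

sequences_seen = STEPS * BATCH_SIZE
max_input_tokens = sequences_seen * SEQ_LEN

training_seconds = TRAINING_DAYS * 24 * 60 * 60
steps_per_second = STEPS / training_seconds

print(f"Sequences seen:          {sequences_seen:,}")
print(f"Input tokens (at most):  {max_input_tokens:,}")
print(f"Average steps per second: {steps_per_second:.2f}")
```

Under these assumptions the model saw 640 million sequences, or at most about 328 billion input tokens, at an average of roughly 2.6 steps per second.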

Acknowledgments

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC). Many thanks for providing access to the TPUs over many years ❤️

Made in the Bavarian Oberland with ❤️ and 🥨.
