EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published Mar 7 • 78
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain Paper • 2407.19584 • Published Jul 28, 2024 • 66
CroissantLLM: A Truly Bilingual French-English Language Model Paper • 2402.00786 • Published Feb 1, 2024 • 27
SaulLM-7B: A pioneering Large Language Model for Law Paper • 2403.03883 • Published Mar 6, 2024 • 83
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 125