Fully generalist synthetic dataset and SOTA small reasoners
AI & ML interests
Open Science LLMs
Organization Card
PleIAs is a private French AI lab training the next generation of language models for document processing.
PleIAs is committed to open science and has coordinated the release of some of the largest open corpora for pre-training.
For more information, visit our website: https://pleias.fr/
Contact us: contact@pleias.fr
spaces 7
baguettotron_demo 📜 • Runtime error • 3
Vintage OCR Corrector (GPU) 📜 • Correct OCR errors in your text • Runtime error • 4
Vintage OCR Corrector (CPU) 📜 • Correct OCR errors in text • Runtime error • 7
Finance Commons Explorer 💻 • Browse finance datasets on Hugging Face • Build error • 9
Reversed-Zotero 📜 • Runtime error • 9
models 29
PleIAs/Monad • Text Generation • 56.7M • 1.93k • 67
PleIAs/Baguettotron • Text Generation • 0.3B • 4.88k • 240
PleIAs/Baguettotron-GGUF • 0.3B • 1.49k • 10
PleIAs/celadon • Text Classification • 0.1B • 63 • 36
PleIAs/OCRerrcr • Token Classification • 0.4B • 5 • 14
PleIAs/ksante-colbert-small • 33.4M • 23 • 1
PleIAs/Pleias-RAG-350M • Text Generation • 0.4B • 246 • 31
PleIAs/Pleias-RAG-1B • Text Generation • 172 • 66
PleIAs/Pleias-RAG-1B-gguf • 1B • 165 • 11
PleIAs/Pleias-RAG-350M-gguf • 0.4B • 48 • 4
datasets 56
PleIAs/BSF_Redline • 1.05M • 29
PleIAs/common_corpus • 69.9k • 134k • 386
PleIAs/Japanese-PD • 1.38M • 178
PleIAs/Arabic-PD • 221k • 76
PleIAs/verse-wikisource • 26 • 2
PleIAs/SYNTH • 68M • 92.9k • 253
PleIAs/Youtube-Commons-Audio-Sample-1000 • 12
PleIAs/gpt-oss20b-samples-dedup • 179k • 120 • 5
PleIAs/Post-OCR-Correction • 50.4k • 1.03k • 135
PleIAs/GoldenSwag • 1.53k • 1.66k • 5