SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing
Paper • 2512.11192 • Published
NLP, Digital Humanities
Gaperon: A Peppered English-French Generative Language Model Suite
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens