CommonCanvas Collection Collection of models trained on the CommonCatalogue datasets • 8 items • Updated May 16, 2024 • 11
ElanMT Collection Japanese-English machine translation trained on an openly licensed corpus • 5 items • Updated Nov 29, 2024 • 3
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 130
CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images Paper • 2310.16825 • Published Oct 25, 2023 • 36
Tiny Series Collection Tiny datasets that empower the foundation of small language models. • 11 items • Updated Jan 26, 2024 • 38