Common Pile

The Common Pile

We are a group of researchers working together to collect and curate openly licensed and public domain data for training large language models. So far, we have released:

If you're interested in contributing, please open an issue on GitHub!