An LLM pre-training dataset containing only public domain and openly licensed text
Nikhil Kandpal
nkandpa2
Recent Activity
upvoted an article 23 days ago
Announcing the Common Pile and Comma v0.1
updated a dataset 27 days ago
common-pile/stackv2_edu_filtered
updated a dataset 27 days ago
common-pile/youtube_filtered