kirch's Collections
Scotch & SOTA 🥃 Pt. 4: Pre-Training Datasets 📜
Updated Sep 25, 2023
We gotta start somewhere; these JSONL files aren't gonna train themselves.
allenai/dolma • Updated Apr 17, 2024 • 778 • 901
allenai/peS2o • Updated Oct 13, 2024 • 2.34k • 171
tiiuae/falcon-refinedweb • Viewer • Updated Jun 20, 2023 • 968M • 29.3k • 846
CarperAI/pilev2-dev • Preview • Updated Mar 13, 2023 • 11 • 25
bigcode/the-stack • Viewer • Updated Apr 13, 2023 • 546M • 7.44k • 792
bigcode/starcoderdata • Viewer • Updated May 16, 2023 • 207M • 7.86k • 429
cerebras/SlimPajama-627B • Preview • Updated Jul 7, 2023 • 33.9k • 460
euirim/goodwiki • Viewer • Updated Sep 11, 2023 • 44.8k • 75 • 52
nampdn-ai/tiny-textbooks • Viewer • Updated Jul 3, 2024 • 420k • 384 • 152
nampdn-ai/tiny-codes • Viewer • Updated Sep 30, 2023 • 1.63M • 307 • 248
roneneldan/TinyStories • Viewer • Updated Aug 12, 2024 • 2.14M • 40.8k • 646
nampdn-ai/tiny-bridgedict • Viewer • Updated Aug 4, 2023 • 17.6k • 15 • 17
nampdn-ai/tiny-webtext • Viewer • Updated Aug 27, 2023 • 2.32M • 89 • 32