heegyu's Collections
Korean Pretraining Dataset
Updated Nov 19, 2024
heegyu/namuwiki-extracted • Updated Jan 15, 2023 • 565k • 180 • 21
heegyu/kowikitext • Updated Oct 2, 2022 • 1.33M • 614 • 6
maywell/korean_textbooks • Updated Jan 10, 2024 • 4.42M • 1.33k • 116
heegyu/korean-petitions • Updated Jan 15, 2023 • 437k • 793 • 13
hac541309/basic_korean_dict • Updated Jul 26, 2023 • 74.9k • 50 • 6
lcw99/oscar-ko-only • Updated Oct 21, 2022 • 3.68M • 89 • 3
uonlp/CulturaX • Updated Dec 16, 2024 • 7.18B • 15.6k • 527
Note: mC4 + OSCAR +
lbox/lbox_open • Updated Apr 11 • 1.06k • 15
HAERAE-HUB/KOREAN-WEBTEXT • Updated May 31, 2024 • 1.28M • 397 • 36
HAERAE-HUB/KOREAN-SyntheticText-1.5B • Updated Jul 22, 2024 • 1.55M • 153 • 14