Georgios Smyrnis
gsmyrnis
Recent Activity
updated a dataset about 15 hours ago
mlfoundations-dev/b2_code_fasttext_pos_ioi_neg_sql
published a dataset about 15 hours ago
mlfoundations-dev/b2_code_fasttext_pos_ioi_neg_sql
updated a dataset about 15 hours ago
mlfoundations-dev/b2_code_fasttext_pos_code_golf_neg_sql
gsmyrnis's activity
Any rundown on the data sources?
2
5
#2 opened 20 days ago by teknium

Update config.json
1
#4 opened 8 months ago by sedrickkeh
TypeError: Couldn't cast array of type
1
#1 opened 8 months ago by shizhediao2

Seems like WARC metadata is missing from this version?
1
#4 opened 8 months ago by yury-zyphra
Missing files
3
#2 opened 10 months ago by pengyuan

Were the documents shuffled before the dataset was split into shards?
3
#5 opened 10 months ago by yury-zyphra
Would you share the 0.28T-token dataset used to achieve the highest scores in the 7B-2x experiment?
2
#6 opened 10 months ago by Mars2050
How many rows are there in the dataset?
1
#4 opened 10 months ago by yury-zyphra
Reproduce the CLIP score
1
#1 opened over 1 year ago by zhangjc404