Percentage of Korean data in the pretraining datasets
#78, opened by Korabbit
When I looked through the paper, I found no report of the percentage of Korean data.
What is the percentage of Korean data?
I have the same question. They say the model outperforms Llama 2 13B on all benchmarks, but it does not seem to support Korean or Vietnamese.
@RoiandDae No, I can't find the answer.