---
license: mit
datasets:
- msarmi9/korean-english-multitarget-ted-talks-task
language:
- en
- ko
---
An English->Korean translation model built on a from-scratch implementation of the Transformer and RoPE.
- Trained from scratch on roughly 130,000 English-Korean sentence pairs.
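
For readers unfamiliar with RoPE (rotary position embedding), here is a minimal pure-Python sketch of the idea. It is not this model's actual code: the function names `rope` and `dot`, the interleaved pairing of dimensions, and the base of 10000 are assumptions taken from the common formulation. Each pair of dimensions is rotated by an angle proportional to the token position, so the dot product of a rotated query and key depends only on their relative offset.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (vec[2i], vec[2i+1]) by position * theta_i.

    Hypothetical helper for illustration; assumes the interleaved pairing
    and base=10000, which may differ from this model's implementation.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even head dimension")
    out = []
    for i in range(dim // 2):
        theta = base ** (-2.0 * i / dim)  # per-pair rotation frequency
        angle = position * theta
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.extend([x * cos_a - y * sin_a, x * sin_a + y * cos_a])
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (made-up values, head dimension 4).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.5, -0.75, 1.0]

# Both position pairs have the same offset (2), so the scores should match.
score_a = dot(rope(q, 3), rope(k, 1))
score_b = dot(rope(q, 10), rope(k, 8))
print(score_a, score_b)
```

Because each step is a pure rotation, the vector norm is unchanged, and only the relative distance between positions affects the attention score.
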
num_epochs = 5
batch_size = 64
config.intermediate_size = 768*4
config.num_attention_heads = 6
config.num_hidden_layers = 8
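
The settings above can be collected into a single object, sketched below. `TranslationConfig` is a hypothetical name, and `hidden_size = 768` is an assumption inferred from the `768*4` expression rather than a value stated in this card. Under that assumption, each attention head would have 128 dimensions, which is even, as RoPE requires.

```python
import math
from dataclasses import dataclass


@dataclass
class TranslationConfig:
    """Hypothetical container for the hyperparameters listed above."""

    hidden_size: int = 768  # assumption: inferred from 768*4, not stated
    intermediate_size: int = 768 * 4
    num_attention_heads: int = 6
    num_hidden_layers: int = 8
    num_epochs: int = 5
    batch_size: int = 64

    @property
    def head_dim(self) -> int:
        """Per-head dimension; hidden_size must divide evenly across heads."""
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        return self.hidden_size // self.num_attention_heads


config = TranslationConfig()

# Rough step count, assuming exactly 130,000 pairs (the card says "roughly").
approx_pairs = 130_000
steps_per_epoch = math.ceil(approx_pairs / config.batch_size)
total_steps = steps_per_epoch * config.num_epochs

print(config.intermediate_size, config.head_dim, steps_per_epoch, total_steps)
```
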
