Model Description

This model is OpenAI's whisper-base model trained on three datasets.

train_steps: 20000
warmup_steps: 2000
lr scheduler: linear warmup cosine decay
max learning rate: 1e-4
batch size: 256
max_grad_norm: 1.0
adamw_beta1: 0.9
adamw_beta2: 0.98
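
The warmup and decay settings above can be sketched as a plain function of the step number. This is a minimal illustration and not the training code used for this model: the function name `learning_rate` is hypothetical, and the final learning rate (`MIN_LR = 0.0`) is an assumption, since the card does not state where the cosine decay ends.

```python
import math

# Values taken from the hyperparameter list above.
TRAIN_STEPS = 20_000
WARMUP_STEPS = 2_000
MAX_LR = 1e-4

# Assumption: the card does not give a floor, so decay to zero is assumed here.
MIN_LR = 0.0


def learning_rate(step: int) -> float:
    """Linear warmup to MAX_LR, then cosine decay to MIN_LR (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 at step 0 to MAX_LR at WARMUP_STEPS.
        return MAX_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to 1.0 past the last step.
    progress = min(1.0, (step - WARMUP_STEPS) / (TRAIN_STEPS - WARMUP_STEPS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine


if __name__ == "__main__":
    for step in (0, 1_000, 2_000, 11_000, 20_000):
        print(step, learning_rate(step))
```

Under these assumptions the rate peaks at step 2000 and falls to half of the maximum midway through the decay phase (step 11000).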

Evaluation

https://github.com/rtzr/Awesome-Korean-Speech-Recognition

The results below are for the test sets from the repository above, excluding the "주요 영역별 회의 음성" (meeting speech by major domain) set. In the table below, whisper_base_komix is this model.

| Model | cv_15_ko | fleurs_ko | kcall_testset | kconf_test | kcounsel_test | klec_testset | kspon_clean | kspon_other |
|---|---|---|---|---|---|---|---|---|
| whisper_base | 21.16 | 11.89 | 42.56 | 27.62 | 22.24 | 28.65 | 30.41 | 27.02 |
| whisper_base_komix | 15.42 | 7.16 | 20.86 | 14.24 | 12.64 | 13.44 | 12.26 | 12.12 |
| whisper_large_v3 | 5.11 | 3.72 | 5.45 | 9.35 | 3.83 | 8.46 | 15.08 | 12.89 |
| whisper_large_v3_turbo | 5.38 | 3.95 | 5.89 | 9.77 | 4.21 | 9.27 | 16.49 | 13.54 |
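
To make the comparison between whisper_base and whisper_base_komix easier to read, the relative change per test set can be computed directly from the table above. This sketch assumes the scores are error rates where lower is better; the card does not name the metric. The helper `relative_reduction` is a hypothetical name used only for illustration.

```python
# Scores copied from the table above, in column order.
TEST_SETS = [
    "cv_15_ko", "fleurs_ko", "kcall_testset", "kconf_test",
    "kcounsel_test", "klec_testset", "kspon_clean", "kspon_other",
]
WHISPER_BASE = [21.16, 11.89, 42.56, 27.62, 22.24, 28.65, 30.41, 27.02]
WHISPER_BASE_KOMIX = [15.42, 7.16, 20.86, 14.24, 12.64, 13.44, 12.26, 12.12]


def relative_reduction(baseline: float, tuned: float) -> float:
    """Fraction by which the score drops relative to the baseline (hypothetical helper)."""
    return (baseline - tuned) / baseline


# Assumption: lower scores are better, so a positive value means improvement.
reductions = {
    name: relative_reduction(base, tuned)
    for name, base, tuned in zip(TEST_SETS, WHISPER_BASE, WHISPER_BASE_KOMIX)
}

if __name__ == "__main__":
    for name, value in reductions.items():
        print(f"{name}: {value:.1%}")
```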