Model Description

This model is OpenAI's whisper-base model trained on three datasets.

train_steps: 20000
warmup_steps: 2000
lr scheduler: linear warmup cosine decay
max learning rate: 1e-4
batch size: 256
max_grad_norm: 1.0
adamw_beta1: 0.9
adamw_beta2: 0.98
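
The schedule implied by the settings above (linear warmup to the maximum learning rate, then cosine decay) can be sketched as follows. This is a minimal illustration, not the training code used for this model. The final learning rate `MIN_LR = 0.0` and the function name `lr_at` are assumptions that do not appear in the model card.

```python
import math

# Values taken from the hyperparameter list above.
TRAIN_STEPS = 20_000
WARMUP_STEPS = 2_000
MAX_LR = 1e-4

# Assumption: the card does not state the learning rate at the end of decay.
MIN_LR = 0.0


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to MAX_LR.
        return MAX_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to 1.0 past the last step.
    progress = min((step - WARMUP_STEPS) / (TRAIN_STEPS - WARMUP_STEPS), 1.0)
    # Cosine decay from MAX_LR down to MIN_LR.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine


if __name__ == "__main__":
    for step in (0, 1_000, 2_000, 11_000, 20_000):
        print(step, lr_at(step))
```

Under these assumptions the rate peaks at step 2000, falls to half the maximum at the midpoint of the decay phase (step 11000), and reaches `MIN_LR` at step 20000.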

Evaluation

https://github.com/rtzr/Awesome-Korean-Speech-Recognition

These are the results on the test sets from the repository above, excluding the "meeting speech by major domain" test set. In the table below, whisper_base_komix is this model.

| Model | cv_15_ko | fleurs_ko | kcall_testset | kconf_test | kcounsel_test | klec_testset | kspon_clean | kspon_other |
|---|---|---|---|---|---|---|---|---|
| whisper_base | 21.16 | 11.89 | 42.56 | 27.62 | 22.24 | 28.65 | 30.41 | 27.02 |
| whisper_base_komix | 15.42 | 7.16 | 20.86 | 14.24 | 12.64 | 13.44 | 12.26 | 12.12 |
| whisper_large_v3 | 5.11 | 3.72 | 5.45 | 9.35 | 3.83 | 8.46 | 15.08 | 12.89 |
| whisper_large_v3_turbo | 5.38 | 3.95 | 5.89 | 9.77 | 4.21 | 9.27 | 16.49 | 13.54 |
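
To put the table in perspective, the relative improvement of whisper_base_komix over whisper_base can be computed per test set. The sketch below uses only the numbers reported above. It assumes the scores are error rates (lower is better), which the card does not state explicitly, and the helper name `relative_reduction` is hypothetical.

```python
# Scores copied from the evaluation table above.
DATASETS = [
    "cv_15_ko", "fleurs_ko", "kcall_testset", "kconf_test",
    "kcounsel_test", "klec_testset", "kspon_clean", "kspon_other",
]
WHISPER_BASE = [21.16, 11.89, 42.56, 27.62, 22.24, 28.65, 30.41, 27.02]
WHISPER_BASE_KOMIX = [15.42, 7.16, 20.86, 14.24, 12.64, 13.44, 12.26, 12.12]


def relative_reduction(baseline: float, tuned: float) -> float:
    """Percentage by which `tuned` lowers the score relative to `baseline`.

    Assumption: scores are error rates, so a lower value is better.
    """
    return (baseline - tuned) / baseline * 100.0


reductions = {
    name: relative_reduction(base, komix)
    for name, base, komix in zip(DATASETS, WHISPER_BASE, WHISPER_BASE_KOMIX)
}

if __name__ == "__main__":
    for name, value in reductions.items():
        print(f"{name:15s} {value:5.1f}%")
```

Read this way, whisper_base_komix scores lower than whisper_base on all eight test sets, and on kspon_clean and kspon_other it also scores lower than both large-v3 variants.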

Model size: 72.6M params (Safetensors, tensor type F32)