Speech Recognition Model for Patients with Dysarthria
Project Information
The 3rd Idea Contest, held with the Future and Software Foundation (재단법인 미래와 소프트웨어)
Project Name
"Improving Communication for Elderly Patients Using Dysarthric Speech Data"
Model Description
- A fine-tuned version of openai/whisper-large-v3.
- This is a Korean speech recognition model for patients with dysarthria, built for the project "Improving Communication for Elderly Patients Using Dysarthric Speech Data". It was obtained by fine-tuning OpenAI's Whisper model so that it reflects the characteristics of dysarthric speech.
- You can try the speech recognition model through the "Inference API" panel on the right.
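For local use, the following is a minimal sketch of transcription with the Transformers `pipeline` API. It assumes the checkpoint is published under the repository id `RecCode/whisper_final`, that the repository ships the processor and tokenizer files alongside the weights, and that `transformers` and `torch` are installed. The `transcribe` helper and the file name `sample.wav` are hypothetical names used only for illustration.

```python
MODEL_ID = "RecCode/whisper_final"  # assumed repository id of the fine-tuned checkpoint


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file to Korean text (hypothetical helper)."""
    # Imported lazily so that defining this function does not load transformers/torch.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Whisper is multilingual, so the language and task are pinned explicitly.
    result = asr(
        audio_path,
        generate_kwargs={"language": "korean", "task": "transcribe"},
    )
    return result["text"]


if __name__ == "__main__":
    # "sample.wav" is a placeholder path, not a file provided with the model.
    print(transcribe("sample.wav"))
```

The first call downloads the checkpoint, which is large for a whisper-large-v3 derivative, so a GPU is advisable for anything beyond a quick check.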
Base Model
- Paper: Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2023, July). Robust speech recognition via large-scale weak supervision. In International Conference on Machine Learning (pp. 28492-28518). PMLR.
- URL: https://proceedings.mlr.press/v202/radford23a.html
Training Data
- AIHub "구음장애 음성 데이터" (Dysarthric Speech Data) (KOR)
- URL: https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=608
Training Hyperparameters
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10
- mixed_precision_training: Native AMP
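To make the schedule concrete, here is a small sketch of how a linear scheduler with warmup would shape the learning rate under the values listed above. The total of 110 optimizer steps is an assumption read off the last row of the results table, and `linear_lr` is a hypothetical helper written for illustration, not code from the training run.

```python
def linear_lr(step: int,
              base_lr: float = 5e-07,
              warmup_steps: int = 10,
              total_steps: int = 110) -> float:
    """Learning rate at a given optimizer step for linear warmup then linear decay.

    total_steps=110 is an assumption taken from the final step in the results table.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 5, 10, 60, 110):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 5e-07 at step 10 and has dropped to half of that by step 60.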
Training Results
Training Loss | Epoch | Step | Validation Loss | WER |
---|---|---|---|---|
4.2932 | 0.09 | 10 | 4.6306 | 16.0442 |
4.2744 | 0.18 | 20 | 4.1942 | 16.2348 |
3.7418 | 0.27 | 30 | 3.7625 | 15.5107 |
3.2037 | 0.36 | 40 | 3.5635 | 14.6723 |
3.4714 | 0.45 | 50 | 3.4383 | 14.3674 |
2.8962 | 0.55 | 60 | 3.3494 | 14.1768 |
2.7958 | 0.64 | 70 | 3.2752 | 18.2927 |
2.8691 | 0.73 | 80 | 3.2208 | 19.5884 |
2.8693 | 0.82 | 90 | 3.1857 | 20.6174 |
2.9474 | 0.91 | 100 | 3.1644 | 20.6555 |
3.1712 | 1.0 | 110 | 3.1551 | 20.6174 |
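For readers unfamiliar with the WER column, the sketch below shows one common way to compute word error rate: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes whitespace tokenization and reports a percentage, which appears to match the scale of the values in the table. It is not necessarily the exact metric implementation used during training, and `word_error_rate` is a hypothetical name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using whitespace tokenization (an assumption)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("the cat sat down", "the cat sat"))  # 25.0
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.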
Framework versions
- Transformers 4.38.0.dev0
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1
Model tree for RecCode/whisper_final
Base model
openai/whisper-large-v3