Commit 8b63e7f by seastar105 · verified · 1 Parent(s): 479d050

Create README.md

Files changed (1)
  1. README.md +56 -0
README.md ADDED
@@ -0,0 +1,56 @@
+ ---
+ library_name: transformers
+ language:
+ - ko
+ base_model:
+ - openai/whisper-medium
+ ---
+
+ ### Model Description
+
+ This model is OpenAI's whisper-medium fine-tuned on the datasets listed below. On the test sets used here, its average score is better than that of whisper-large-v3.
+
+ - Korean Speech (한국어 음성): https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=123
+ - Address Speech Data (주소 음성 데이터): https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71556
+ - Meeting Speech Recognition Data by Major Domain (주요 영역별 회의 음성인식 데이터): https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=464
+ - Low-Quality Telephone Network Speech Recognition Data (저음질 전화망 음성인식 데이터): https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&dataSetSn=571
+ - Broadcast Content Conversational Speech Recognition Data (방송 콘텐츠 대화체 음성인식 데이터): https://www.aihub.or.kr/aihubdata/data/view.do?dataSetSn=463
+
+ Training setup
+
+ ```
+ train_steps: 50000
+ warmup_steps: 500
+ lr scheduler: linear warmup cosine decay
+ max learning rate: 1e-4
+ batch size: 1024
+ max_grad_norm: 1.0
+ adamw_beta1: 0.9
+ adamw_beta2: 0.98
+ adamw_eps: 1e-6
+ ```
+
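To make the "linear warmup cosine decay" entry in the training setup concrete, here is a minimal sketch of such a schedule in plain Python, using the listed `train_steps`, `warmup_steps`, and max learning rate. The function name `lr_at` and the final learning rate of 0.0 are assumptions for illustration; the README does not say what value the schedule decays to or which library implemented it.

```python
import math

# Values taken from the training setup listed in the README.
TRAIN_STEPS = 50_000
WARMUP_STEPS = 500
MAX_LR = 1e-4


def lr_at(step, max_lr=MAX_LR, warmup_steps=WARMUP_STEPS,
          train_steps=TRAIN_STEPS, end_lr=0.0):
    """Learning rate at a given step (hypothetical helper).

    end_lr=0.0 is an assumption; the README does not state the final value.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to max_lr.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to end_lr over the remaining steps.
    progress = min(1.0, (step - warmup_steps) / (train_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return end_lr + (max_lr - end_lr) * cosine


if __name__ == "__main__":
    for step in (0, 250, 500, 25_250, 50_000):
        print(f"step {step:>6}: lr = {lr_at(step):.2e}")
```

Under these assumptions the rate peaks at step 500, is back to half of the peak midway through the decay phase (step 25,250), and reaches the assumed floor at step 50,000.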
+ ### Evaluation
+
+ https://github.com/rtzr/Awesome-Korean-Speech-Recognition
+
+ The results below are for the test sets from the repository above, excluding the Meeting Speech by Major Domain set. In the table, the whisper_medium_komixv2 row is this model.
+
+
+ | Model                  | Average | cv_15_ko | fleurs_ko | kcall_testset | kconf_test | kcounsel_test | klec_testset | kspon_clean | kspon_other |
+ |------------------------|---------|----------|-----------|---------------|------------|---------------|--------------|-------------|-------------|
+ | whisper_tiny           | 36.63   | 31.03    | 18.48     | 58.57         | 36.02      | 33.52         | 35.74        | 42.22       | 37.42       |
+ | whisper_base           | 40.61   | 22.45    | 15.7      | 85.94         | 41.95      | 32.38         | 39.24        | 46.92       | 40.29       |
+ | whisper_small          | 17.52   | 11.56    | 6.33      | 30.79         | 18.96      | 13.57         | 18.71        | 22.02       | 18.23       |
+ | whisper_medium         | 13.92   | 8.2      | 4.38      | 25.73         | 15.66      | 10.1          | 14.9         | 17.16       | 15.22       |
+ | whisper_large          | 12.77   | 6.83     | 3.9       | 22.68         | 14.35      | 9.2           | 13.89        | 16.78       | 14.56       |
+ | whisper_large_v2       | 12.29   | 6.58     | 3.74      | 22.26         | 13.88      | 8.95          | 13.84        | 15.51       | 13.6        |
+ | whisper_large_v3       | 7.99    | 5.11     | 3.72      | 5.45          | 9.35       | 3.83          | 8.46         | 15.08       | 12.89       |
+ | whisper_large_v3_turbo | 10.75   | 5.38     | 3.99      | 10.93         | 10.27      | 4.21          | 9.42         | 26.66       | 15.16       |
+ | whisper_base_komixv2   | 8.73    | 10.27    | 5.14      | 6.23          | 10.86      | 7.01          | 10.38        | 9.98        | 9.99        |
+ | whisper_small_komixv2  | 7.36    | 7.07     | 4.19      | 5.6           | 9.67       | 5.5           | 8.55         | 9.26        | 9.07        |
+ | whisper_medium_komixv2 | 7.3     | 6.62     | 4.52      | 5.85          | 9.42       | 5.47          | 8.38         | 9.19        | 8.97        |
+
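The comparison with whisper_large_v3 rests on the Average column, so it can help to see how that column relates to the per-test-set scores. The sketch below copies two rows from the table and checks that the reported average matches an unweighted mean over the eight test sets. It also lists the test sets where whisper_medium_komixv2 scores lower. The helper names are hypothetical, and treating lower scores as better (an error rate) is an assumption, since the README does not name the metric.

```python
from statistics import mean

# Column order of the per-test-set scores in the table.
TEST_SETS = [
    "cv_15_ko", "fleurs_ko", "kcall_testset", "kconf_test",
    "kcounsel_test", "klec_testset", "kspon_clean", "kspon_other",
]

# Scores copied from the table rows for the two models being compared.
SCORES = {
    "whisper_large_v3": [5.11, 3.72, 5.45, 9.35, 3.83, 8.46, 15.08, 12.89],
    "whisper_medium_komixv2": [6.62, 4.52, 5.85, 9.42, 5.47, 8.38, 9.19, 8.97],
}

# Values from the Average column of the table.
REPORTED_AVERAGE = {"whisper_large_v3": 7.99, "whisper_medium_komixv2": 7.3}


def macro_average(scores):
    """Unweighted mean over test sets, rounded to two decimals."""
    return round(mean(scores), 2)


def lower_on(model, baseline):
    """Test sets where `model` scores lower than `baseline`.

    Assumes lower is better, which the README does not state explicitly.
    """
    return [
        name
        for name, a, b in zip(TEST_SETS, SCORES[model], SCORES[baseline])
        if a < b
    ]


if __name__ == "__main__":
    for model, scores in SCORES.items():
        print(model, macro_average(scores), "reported:", REPORTED_AVERAGE[model])
    print(lower_on("whisper_medium_komixv2", "whisper_large_v3"))
```

Read this way, the average favors whisper_medium_komixv2 mainly because of the large gaps on kspon_clean and kspon_other; on most of the other test sets whisper_large_v3 has the lower score.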
+ ### Acknowledgement
+ - This model was trained with support from Google's TRC program.
+ - Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC)