---
license: apache-2.0
language:
- ko
base_model:
- LGAI-EXAONE/EXAONE-4.0-1.2B
tags:
- speech-to-text
- korean
- audio
- voice
- bigdefence
- EXAONE
- LG
pipeline_tag: audio-text-to-text
---

## 🎧 Bigvox

- **Bigvox** is a high-performance, low-latency speech-language multimodal model specialized for Korean speech recognition. It is built on [LGAI-EXAONE/EXAONE-4.0-1.2B](https://huggingface.co/LGAI-EXAONE/EXAONE-4.0-1.2B). 🚀
- It adopts an **End-to-End** speech multimodal architecture: everything from speech input to text output is handled in a single pipeline, with no additional intermediate model.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/653494138bde2fae198fe89e/d7YWXLrfOnuVSjn_ndIon.png)

### 📂 Model Access

- **GitHub**: [bigdefence/bigvox-exaone](https://github.com/bigdefence/bigvox-exaone) 🌐
- **HuggingFace**: [bigdefence/Bigvox-Exaone4-Audio](https://huggingface.co/bigdefence/Bigvox-Exaone4-Audio) 🤗
- **Model size**: 2B parameters 📊

## 🌟 Key Features

- **🇰🇷 Korean-specialized**: Optimized for Korean speech patterns and linguistic characteristics
- **⚡ Lightweight**: Efficient inference with 2B parameters
- **🎯 High accuracy**: Strong performance across a variety of Korean speech environments
- **🔧 Practical**: Suitable for real-time speech recognition applications

## 📋 Model Information

| Item | Details |
|------|----------|
| **Base model** | LGAI-EXAONE/EXAONE-4.0-1.2B |
| **Language** | Korean |
| **Model size** | ~2B parameters |
| **Task type** | Speech-to-Text speech multimodal |
| **License** | Apache 2.0 |

### 🔧 Repository Download and Environment Setup

To get started with **Bigvox**, clone the repository and set up the environment as follows. 🛠️

1. **Clone the repository**:
   ```bash
   git clone https://github.com/bigdefence/bigvox-exaone
   cd bigvox-exaone
   ```
2. **Install dependencies**:
   ```bash
   bash setting.sh
   ```

### 📥 Download Methods

**Using the Huggingface CLI**:
```bash
pip install -U huggingface_hub
huggingface-cli download bigdefence/Bigvox-Exaone4-Audio --local-dir ./checkpoints
```

**Using Snapshot Download**:
```bash
pip install -U huggingface_hub
```
```python
from huggingface_hub import snapshot_download

# Download the full model repository into ./checkpoints.
# Recent huggingface_hub releases resume interrupted downloads automatically,
# so the deprecated resume_download argument is not passed.
snapshot_download(
    repo_id="bigdefence/Bigvox-Exaone4-Audio",
    local_dir="./checkpoints",
)
```

**Using Git**:
```bash
git lfs install
git clone https://huggingface.co/bigdefence/Bigvox-Exaone4-Audio
```

### 🛠️ Dependency Models

- **Speech Encoder**: [Whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) 🎤

### 🔄 Local Inference

To run inference with **Bigvox**, follow the steps below to set up the model and run it locally. 📡

1. **Prepare the models**:
   - Download **Bigvox** from [HuggingFace](https://huggingface.co/bigdefence/Bigvox-Exaone4-Audio) 📦
   - Download the **Whisper-large-v3** speech encoder from [HuggingFace](https://huggingface.co/openai/whisper-large-v3) and place it in the `./models/speech_encoder/` directory 🎤
2. **Run inference**:
   - **Speech-to-text (S2T)** inference:
     - **Non-streaming**
       ```bash
       python3 omni_speech/infer/bigvox.py --query_audio test_audio.wav
       ```
     - **Streaming**
       ```bash
       python3 omni_speech/infer/bigvox_streaming.py --query_audio test_audio.wav
       ```

## 🔧 Training Details

### Dataset

- **VoiceAssistant**: Korean conversational speech data

### Training Setup

- **Base Model**: LGAI-EXAONE/EXAONE-4.0-1.2B
- **Hardware**: 1x NVIDIA RTX 6000A GPU
- **Training Time**: 4 hours

## ⚠️ Limitations

- Performance may degrade in environments with heavy background noise
- Recognition accuracy may drop for very fast or mumbled speech
- Recognition accuracy for technical terms and proper nouns may vary by domain

## 📞 Contact

- **Developed by**: BigDefence

## 📈 Update Log

### v1.0.0 (2024.12)

- 🎉 **Initial model release**: Bigvox made public
- 🇰🇷 **Korean-specialized**: Korean speech-to-text speech multimodal model based on LGAI-EXAONE/EXAONE-4.0-1.2B

---

## 🤝 Contributing

If you would like to contribute to the **Bigvox** project:

---

Build the future of Korean AI speech recognition together with **BigDefence**! 🚀🇰🇷

*"Every voice matters, every word counts"*
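
The model-preparation step of local inference involves two downloads, which can be scripted together. The following is a minimal sketch, not the project's official setup script. The repository IDs and target directories are taken from this README, while the names `DOWNLOADS` and `fetch_all` are hypothetical. It also assumes the inference scripts accept the Hugging Face snapshot layout for the speech encoder, which this README does not confirm, so check the GitHub repository if loading fails.

```python
from pathlib import Path

# Repo IDs and target directories as listed in this README.
# Assumption: the inference scripts read the Bigvox weights from ./checkpoints
# and the Whisper encoder from ./models/speech_encoder in snapshot layout.
DOWNLOADS = {
    "bigdefence/Bigvox-Exaone4-Audio": Path("./checkpoints"),
    "openai/whisper-large-v3": Path("./models/speech_encoder"),
}


def fetch_all(downloads=DOWNLOADS):
    """Download every repository in `downloads` into its local directory."""
    # Imported lazily so the mapping above can be inspected without the package.
    from huggingface_hub import snapshot_download

    for repo_id, local_dir in downloads.items():
        local_dir.mkdir(parents=True, exist_ok=True)
        snapshot_download(repo_id=repo_id, local_dir=str(local_dir))
```

Calling `fetch_all()` from the repository root performs both downloads. It requires network access and enough disk space for both models.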