Commit 10ab439 by qiangchunyu (verified) · 1 Parent(s): ea1cfe3

Create README.md

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ language: en
+ tags:
+ - audio
+ - speech-processing
+ - speech-codec
+ - low-bitrate
+ - streaming
+ - tts
+ - cross-modal
+ license: apache-2.0
+ ---
+
+ # SecoustiCodec: Cross-Modal Aligned Streaming Single-Codecbook Speech Codec
+
+ ## Resources
+ - [📄 Research Paper](https://arxiv.org/abs/2508.02849)
+ - [💻 Source Code](https://github.com/QiangChunyu/SecoustiCodec)
+ - [🤗 Demo Page](https://qiangchunyu.github.io/SecoustiCodec_Page/)
+
+ ## Model Overview
+
+ SecoustiCodec is a state-of-the-art **low-bitrate streaming speech codec** for speech reconstruction at ultra-low bitrates (0.27-1 kbps). The model introduces several innovations:
+
+ - 🧠 **Cross-modal alignment**: Aligns text and speech in a joint multimodal frame-level space
+ - 🔍 **Semantic-paralinguistic disentanglement**: Separates linguistic content from speaker characteristics
+ - ⚡ **Streaming support**: Supports real-time processing
+ - 📊 **Efficient quantization**: A VAE+FSQ (variational autoencoder plus finite scalar quantization) approach addresses token distribution problems
+ - 🎯 **Acoustic-constrained optimization**: Ensures stable convergence
+
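The efficient-quantization item above names FSQ (finite scalar quantization). As a rough illustration of the general technique only, the sketch below bounds each latent dimension, rounds it to a small integer grid, and packs the per-dimension codes into one token id of a single implicit codebook. It is not taken from the SecoustiCodec source: the function names, the level counts, and the example frame rate are hypothetical, the VAE part and the straight-through gradient used in training are omitted, and only odd level counts are handled to keep the rounding simple.

```python
import math

def fsq_quantize(z, levels):
    """Quantize each latent dimension to one of levels[i] integer values.

    Assumes every entry of `levels` is odd (hypothetical simplification).
    """
    codes = []
    for value, n_levels in zip(z, levels):
        half = (n_levels - 1) / 2           # grid spans -half .. +half
        bounded = math.tanh(value) * half   # squash into (-half, half)
        codes.append(int(round(bounded)))   # snap to the nearest grid point
    return codes

def codes_to_index(codes, levels):
    """Pack per-dimension codes into one token id (mixed-radix encoding)."""
    index = 0
    for code, n_levels in zip(codes, levels):
        half = (n_levels - 1) // 2
        index = index * n_levels + (code + half)  # shift code to 0 .. n_levels-1
    return index

def bitrate_bps(frame_rate_hz, levels):
    """Bits per second when one token is emitted per frame."""
    return frame_rate_hz * math.log2(math.prod(levels))

# Hypothetical configuration: 3 latent dimensions with 5 levels each,
# giving an implicit codebook of 5 * 5 * 5 = 125 entries.
levels = [5, 5, 5]
codes = fsq_quantize([0.0, 2.0, -2.0], levels)
token = codes_to_index(codes, levels)
print(codes, token)  # → [0, 2, -2] 70
```

Because the codebook is implicit in the grid, there is no learned embedding table to collapse, and the bitrate follows directly from the frame rate and the product of the level counts (`bitrate_bps` above). The actual levels and frame rate used by SecoustiCodec are described in the paper and repository linked under Resources.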
+
+
+ ## Architecture Overview
+
+ ![Model Architecture](https://qiangchunyu.github.io/SecoustiCodec_Page/model.png)
+
+
+ ## Acknowledgments
+ - We used [HiFiGAN](https://github.com/jik876/hifi-gan) for efficient waveform generation.
+ - We referred to [MIMICodec](https://huggingface.co/kyutai/mimi) when implementing the codec.
+
+
+ ## Citation
+ ```bibtex
+ @article{qiang2025secousticodec,
+ title={SecoustiCodec: Cross-Modal Aligned Streaming Single-Codecbook Speech Codec},
+ author={Qiang, Chunyu and Wang, Haoyu and Gong, Cheng and Wang, Tianrui and Fu, Ruibo and Wang, Tao and Chen, Ruilong and Yi, Jiangyan and Wen, Zhengqi and Zhang, Chen and Wang, Longbiao and Dang, Jianwu and Tao, Jianhua},
+ journal={arXiv preprint arXiv:2508.02849},
+ year={2025}
+ }
+ ```