Update README.md
README.md CHANGED

@@ -11,11 +11,13 @@ license: cc-by-nc-4.0
 <p>
 </p>
 </p>
-<a href="https" style="color:red">Paper </a> |
-<a href="https://huggingface.co/IDEA-Emdoor/DistilCodec-v1.0" style="color:#FFD700">
+<a href="https://arxiv.org/abs/2505.17426" style="color:red">Paper</a> |
+<a href="https://huggingface.co/IDEA-Emdoor/DistilCodec-v1.0" style="color:#FFD700">HuggingFace Model</a> |
 <a href="https://github.com/IDEA-Emdoor-Lab/DistilCodec" style="color:gray">Code</a>
 <p>
 <img src="./idea_logo.png" alt="Institution 1" style="width: 200px; height: 60px;">
+</p>
+<p>
 <img src="./yidao_logo.png" alt="Institution 2" style="width: 200px; height: 60px;">
 <img src="./yijiayiban.png" alt="Institution 3" style="width: 200px; height: 60px;">
 </p>
@@ -23,9 +25,9 @@ license: cc-by-nc-4.0
 
 
 # 🔥 News
-- *2025.05.
-- *2025.05.26*:
-- *2025.05.
+- *2025.05.26*: We released the DistilCodec-v1.0 checkpoint on [huggingface](https://huggingface.co/IDEA-Emdoor/DistilCodec-v1.0).
+- *2025.05.26*: The paper is available on [arxiv](https://arxiv.org/abs/2505.17426).
+- *2025.05.23*: We submitted the paper to arXiv.
 
 ## Introduction of DistilCodec
 The Joint Laboratory of the International Digital Economy Academy (IDEA) and Emdoor, in collaboration with Emdoor Information Technology Co., Ltd. and Shenzhen Yijiayiban Information Technology Co., Ltd, has launched DistilCodec - a Single-Codebook Neural Audio Codec (NAC) with 32768 codes trained on universal audio. The foundational network architecture of DistilCodec adopts an Encoder-VQ-Decoder framework
@@ -121,18 +123,38 @@ codec.save_wav(
 ## Available DistilCodec models
 | Model Version | Huggingface | Corpus | Token/s | Domain |
 |---------------|-------------|--------|---------|--------|
-| DistilCodec-v1.0 | [
+| DistilCodec-v1.0 | [HuggingFace](https://huggingface.co/IDEA-Emdoor/DistilCodec-v1.0) | Universal Audio | 93 | Universal Audio |
 
 
 ## Citation
 
-If you find
+If you find our work useful in your research, please cite our work:
 
 ```
-@
-
-
-
-
+@misc{wang2025unittsendtoendttsdecoupling,
+      title={UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information},
+      author={Rui Wang and Qianguo Sun and Tianrong Chen and Zhiyun Zeng and Junlong Wu and Jiaxing Zhang},
+      year={2025},
+      eprint={2505.17426},
+      archivePrefix={arXiv},
+      primaryClass={cs.SD},
+      url={https://arxiv.org/abs/2505.17426},
 }
-```
+```
+
+
+## Disclaimer
+
+DistilCodec provides the capability of universal audio discretization only for academic research purposes. We encourage the community to uphold safety and ethical principles in AI research and applications.
+
+Important Notes:
+
+- Compliance with the model's open-source license is mandatory.
+
+- Unauthorized voice replication applications are strictly prohibited.
+
+- Developers bear no responsibility for any misuse of this model.
+
+
+## License
+<a href="https://arxiv.org/abs/2505.17426">UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information</a> © 2025 by <a href="https://creativecommons.org">Rui Wang, Qianguo Sun, Tianrong Chen, Zhiyun Zeng, Junlong Wu, Jiaxing Zhang</a> is licensed under <a href="https://creativecommons.org/licenses/by-nc-nd/4.0/">CC BY-NC-ND 4.0</a><img src="https://mirrors.creativecommons.org/presskit/icons/cc.svg" style="max-width: 1em;max-height:1em;margin-left: .2em;"><img src="https://mirrors.creativecommons.org/presskit/icons/by.svg" style="max-width: 1em;max-height:1em;margin-left: .2em;"><img src="https://mirrors.creativecommons.org/presskit/icons/nc.svg" style="max-width: 1em;max-height:1em;margin-left: .2em;"><img src="https://mirrors.creativecommons.org/presskit/icons/nd.svg" style="max-width: 1em;max-height:1em;margin-left: .2em;">
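For readers unfamiliar with the Encoder-VQ-Decoder framework named in the introduction, the core quantization step can be sketched in plain Python: each encoder output frame is replaced by the index of its nearest codebook vector, so the audio becomes a sequence of discrete token ids. This is a toy illustration only (tiny 16-entry codebook, made-up frames) and not DistilCodec's actual implementation, which uses a single 32768-entry codebook.

```python
import math
import random

def quantize(frame, codebook):
    """Return (index, code) of the codebook vector nearest to `frame`
    by squared Euclidean distance -- the basic VQ lookup."""
    best_idx, best_dist = 0, math.inf
    for idx, code in enumerate(codebook):
        dist = sum((f - c) ** 2 for f, c in zip(frame, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, codebook[best_idx]

# Toy setup: a 16-entry codebook stands in for DistilCodec's 32768-entry one.
random.seed(0)
dim, codebook_size = 4, 16
codebook = [[random.uniform(-1, 1) for _ in range(dim)]
            for _ in range(codebook_size)]

# Hypothetical "encoder output" frames; each maps to one discrete token id.
frames = [[0.1, -0.2, 0.3, 0.0], [0.5, 0.5, -0.5, 0.5]]
tokens = [quantize(f, codebook)[0] for f in frames]
print(tokens)  # two integer token ids in [0, 16)
```

The decoder side reverses the lookup: token ids index back into the codebook, and the retrieved vectors are fed to the decoder network to reconstruct audio.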