Text-to-Speech
Safetensors
English
Chinese
spark-tts committed on
Commit
56ce01d
·
1 Parent(s): 9777ed1

update readme

Files changed (1)
  1. README.md +25 -152
README.md CHANGED
@@ -1,5 +1,11 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
 
5
 
@@ -8,7 +14,7 @@ license: apache-2.0
8
  Spark-TTS
9
  </h1>
10
  <p>
11
- Official PyTorch code for inference of <br>
12
  <b><em>Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens</em></b>
13
  </p>
14
  <p>
@@ -16,16 +22,29 @@ license: apache-2.0
16
  </p>
17
  <p>
18
  </p>
19
- <a href="https://sparkaudio.github.io/spark-tts/"><img src="https://img.shields.io/badge/Demo-Page-lightgrey" alt="version"></a>
20
- <a href="https://github.com/SparkAudio/Spark-TTS"><img src="https://img.shields.io/badge/Platform-linux-lightgrey" alt="version"></a>
21
- <a href="https://github.com/SparkAudio/Spark-TTS"><img src="https://img.shields.io/badge/Python-3.12+-orange" alt="version"></a>
22
- <a href="https://github.com/SparkAudio/Spark-TTS"><img src="https://img.shields.io/badge/PyTorch-2.5+-brightgreen" alt="python"></a>
23
- <a href="https://github.com/SparkAudio/Spark-TTS"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="mit"></a>
24
  </div>
25
 
26
 
27
  ## Spark-TTS 🔥
28
 
29
  ### Overview
30
 
31
  Spark-TTS is an advanced text-to-speech system that uses the power of large language models (LLM) for highly accurate and natural-sounding voice synthesis. It is designed to be efficient, flexible, and powerful for both research and production use.
@@ -118,152 +137,6 @@ You can start the UI interface by running `python webui.py`, which allows you to
118
  | ![Image 1](src/figures/gradio_TTS.png) | ![Image 2](src/figures/gradio_control.png) |
119
 
120
 
121
- ## **Demos**
122
-
123
- Here are some demos generated by Spark-TTS using zero-shot voice cloning. For more demos, visit our [demo page](https://spark-tts.github.io/).
124
-
125
- ---
126
-
127
- <table>
128
- <tr>
129
- <td align="center">
130
-
131
- **Donald Trump**
132
- </td>
133
- <td align="center">
134
-
135
- **Zhongli (Genshin Impact)**
136
- </td>
137
- </tr>
138
-
139
- <tr>
140
- <td align="center">
141
-
142
- [Donald Trump](https://github.com/user-attachments/assets/fb225780-d9fe-44b2-9b2e-54390cb3d8fd)
143
-
144
- </td>
145
- <td align="center">
146
-
147
- [Zhongli](https://github.com/user-attachments/assets/80eeb9c7-0443-4758-a1ce-55ac59e64bd6)
148
-
149
- </td>
150
- </tr>
151
- </table>
152
-
153
- ---
154
-
155
- <table>
156
-
157
- <tr>
158
- <td align="center">
159
-
160
- **陈鲁豫 Chen Luyu**
161
- </td>
162
- <td align="center">
163
-
164
- **杨澜 Yang Lan**
165
- </td>
166
- </tr>
167
-
168
- <tr>
169
- <td align="center">
170
-
171
- [陈鲁豫Chen_Luyu.webm](https://github.com/user-attachments/assets/5c6585ae-830d-47b1-992d-ee3691f48cf4)
172
- </td>
173
- <td align="center">
174
-
175
- [Yang_Lan.webm](https://github.com/user-attachments/assets/2fb3d00c-abc3-410e-932f-46ba204fb1d7)
176
- </td>
177
- </tr>
178
- </table>
179
-
180
- ---
181
-
182
-
183
- <table>
184
- <tr>
185
- <td align="center">
186
-
187
- **余承东 Richard Yu**
188
- </td>
189
- <td align="center">
190
-
191
- **马云 Jack Ma**
192
- </td>
193
- </tr>
194
-
195
- <tr>
196
- <td align="center">
197
-
198
- [Yu_Chengdong.webm](https://github.com/user-attachments/assets/78feca02-84bb-4d3a-a770-0cfd02f1a8da)
199
-
200
- </td>
201
- <td align="center">
202
-
203
- [Ma_Yun.webm](https://github.com/user-attachments/assets/2d54e2eb-cec4-4c2f-8c84-8fe587da321b)
204
-
205
- </td>
206
- </tr>
207
- </table>
208
-
209
- ---
210
-
211
-
212
- <table>
213
- <tr>
214
- <td align="center">
215
-
216
- **刘德华 Andy Lau**
217
- </td>
218
- <td align="center">
219
-
220
- **徐志胜 Xu Zhisheng**
221
- </td>
222
- </tr>
223
-
224
- <tr>
225
- <td align="center">
226
-
227
- [Liu_Dehua.webm](https://github.com/user-attachments/assets/195b5e97-1fee-4955-b954-6d10fa04f1d7)
228
-
229
- </td>
230
- <td align="center">
231
-
232
- [Xu_Zhisheng.webm](https://github.com/user-attachments/assets/dd812af9-76bd-4e26-9988-9cdb9ccbb87b)
233
-
234
- </td>
235
- </tr>
236
- </table>
237
-
238
-
239
- ---
240
-
241
- <table>
242
- <tr>
243
- <td align="center">
244
-
245
- **哪吒 Nezha**
246
- </td>
247
- <td align="center">
248
-
249
- **ζŽι– Li Jing**
250
- </td>
251
- </tr>
252
-
253
- <tr>
254
- <td align="center">
255
-
256
- [Ne_Zha.webm](https://github.com/user-attachments/assets/8c608037-a17a-46d4-8588-4db34b49ed1d)
257
- </td>
258
- <td align="center">
259
-
260
- [Li_Jing.webm](https://github.com/user-attachments/assets/aa8ba091-097c-4156-b4e3-6445da5ea101)
261
-
262
- </td>
263
- </tr>
264
- </table>
265
-
266
-
267
  ## To-Do List
268
 
269
  - [ ] Release the Spark-TTS paper.
 
1
  ---
2
  license: apache-2.0
3
+ language:
4
+ - en
5
+ - zh
6
+ tags:
7
+ - text-to-speech
8
+ library_tag: spark-tts
9
  ---
10
 
11
 
 
14
  Spark-TTS
15
  </h1>
16
  <p>
17
+ Official model for <br>
18
  <b><em>Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens</em></b>
19
  </p>
20
  <p>
 
22
  </p>
23
  <p>
24
  </p>
25
+ <a href="https://sparkaudio.github.io/spark-tts/">
26
+ <img src="https://img.shields.io/badge/Demo-Page-lightgrey" alt="version">
27
+ </a>&nbsp;
28
+ <a href="https://github.com/SparkAudio/Spark-TTS">
29
+ <img src="https://img.shields.io/badge/GitHub-Repo-black?logo=github" alt="GitHub Repo">
30
+ </a>&nbsp;
31
+ <a href="https://github.com/SparkAudio/Spark-TTS">
32
+ <img src="https://img.shields.io/badge/Python-3.12+-orange" alt="version">
33
+ </a>&nbsp;
34
+ <a href="https://github.com/SparkAudio/Spark-TTS">
35
+ <img src="https://img.shields.io/badge/PyTorch-2.5+-brightgreen" alt="python">
36
+ </a>&nbsp;
37
+ <a href="https://github.com/SparkAudio/Spark-TTS">
38
+ <img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="mit">
39
+ </a>
40
+
41
  </div>
42
 
43
 
44
  ## Spark-TTS 🔥
45
 
46
+ ### 👉🏻 [Spark-TTS Demos](https://sparkaudio.github.io/spark-tts/) 👈🏻
47
+
48
  ### Overview
49
 
50
  Spark-TTS is an advanced text-to-speech system that uses the power of large language models (LLM) for highly accurate and natural-sounding voice synthesis. It is designed to be efficient, flexible, and powerful for both research and production use.
 
137
  | ![Image 1](src/figures/gradio_TTS.png) | ![Image 2](src/figures/gradio_control.png) |
138
 
139
 
140
  ## To-Do List
141
 
142
  - [ ] Release the Spark-TTS paper.
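
The metadata block this commit adds at the top of README.md (`license`, `language`, `tags`, `library_tag`) is plain front matter between two `---` lines. As a rough illustration of how those fields can be read back, here is a minimal sketch using only the Python standard library. The helper name `parse_front_matter` and the `README_HEAD` constant are hypothetical, and the parser handles only the `key: value` and `key:` plus `- item` shapes seen in this diff; it is not a general YAML parser.

```python
# Minimal sketch (assumption: front matter uses only the two shapes shown in
# this commit: "key: value" and "key:" followed by "- item" lines).

# Copy of the front matter as it appears on the "+" side of the diff.
README_HEAD = """\
---
license: apache-2.0
language:
- en
- zh
tags:
- text-to-speech
library_tag: spark-tts
---
"""


def parse_front_matter(text):
    """Return the fields between the first two '---' lines as a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_key = None  # key of the list currently being filled, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item belonging to the most recent "key:" line.
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value  # scalar field
                current_key = None
            else:
                meta[key] = []  # start of a list field
                current_key = key
    return meta


metadata = parse_front_matter(README_HEAD)
print(metadata["language"])  # ['en', 'zh']
```

For anything beyond this simple shape, a full YAML library would be the safer choice.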