pipeline_tag: text-to-speech
---

# IndicF5-TTS TensorRT-LLM Fast Inference – Zuppppppp 🚀

Fast inference version of IndicF5 TTS using TensorRT-LLM. Supports 11 Indian languages.

---

Accelerated inference for IndicF5 TTS – made for those of you who asked for speed! We heard you loud and clear. Sharing with ❤️ from **Saryps Labs**.

This build is based on the amazing work at [wgs/F5-TTS-Faster](https://huggingface.co/wgs/F5-TTS-Faster), which converts F5-TTS checkpoints to run with TensorRT-LLM.

### 🔧 How it works (a quick overview)

Here’s the basic workflow used for acceleration (as shared in [wgs/F5-TTS-Faster](https://huggingface.co/wgs/F5-TTS-Faster)):

- First, export **F5-TTS** to ONNX in three parts.
- Then, use **TensorRT-LLM** to rewrite the relevant **Transformer** parts of the network for acceleration.
- The front-end and decoder still use ONNX inference.
- You can also use `CUDAExecutionProvider`, `OpenVINOExecutionProvider`, etc., depending on your setup.

If you're curious about the details, dive into the [original repo](https://github.com/WGS-note/F5_TTS_Faster) or ping me; happy to expand on anything.
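
Since the front-end and decoder run through ONNX Runtime, provider selection matters in practice. Here's a minimal sketch; `pick_providers` is a hypothetical helper (not from the repo) showing how you might fall back through the execution providers mentioned above:

```python
# Hypothetical helper (not part of the repo): choose ONNX Runtime execution
# providers in order of preference. With onnxruntime installed you would pass
# ort.get_available_providers() and hand the result to
# ort.InferenceSession(model_path, providers=...).

def pick_providers(available, preferred=("CUDAExecutionProvider",
                                         "OpenVINOExecutionProvider",
                                         "CPUExecutionProvider")):
    # Keep only providers that are actually installed, preserving preference order.
    return [p for p in preferred if p in available]
```

This way the same script runs on a CUDA box, an OpenVINO setup, or plain CPU without code changes.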

---

### 🚀 Quickstart

To try it out yourself:

1. Clone the [F5_TTS_Faster](https://github.com/WGS-note/F5_TTS_Faster) repo.
2. Set up the environment for TensorRT-LLM models (follow the repo instructions – it’s a little convoluted, but manageable).
3. Place the `ckpts` folder at the root of the project.
4. Run the sample script inside `export_trtllm/` to begin inference.

Benchmarks coming soon! If you try it out and get some numbers, it would be awesome if you could share them back 🫡
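
The quickstart steps above as shell commands (the sample-script name below is an assumption; check `export_trtllm/` in the repo for the actual entry point):

```shell
# 1. Clone the repo.
git clone https://github.com/WGS-note/F5_TTS_Faster.git
cd F5_TTS_Faster

# 2. Set up the TensorRT-LLM environment per the repo's instructions.

# 3. Copy your checkpoints to the project root:
#    cp -r /path/to/ckpts ./ckpts

# 4. Run the sample script inside export_trtllm/:
ls export_trtllm/            # see what's actually there first
python export_trtllm/sample.py
```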

---

### 🗣️ Supported Languages

This model supports high-quality TTS synthesis in the following Indian languages:

- Hindi (`hi`)
- Telugu (`te`)
- Assamese (`as`)
- Bengali (`bn`)
- Gujarati (`gu`)
- Marathi (`mr`)
- Kannada (`kn`)
- Malayalam (`ml`)
- Odia (`or`)
- Punjabi (`pa`)
- Tamil (`ta`)
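
If you're wiring this into a pipeline, the codes above are handy as a lookup table. `check_language` below is a hypothetical helper, not part of the model's API:

```python
# The supported language codes from the list above, as a lookup table.
SUPPORTED_LANGUAGES = {
    "hi": "Hindi", "te": "Telugu", "as": "Assamese", "bn": "Bengali",
    "gu": "Gujarati", "mr": "Marathi", "kn": "Kannada", "ml": "Malayalam",
    "or": "Odia", "pa": "Punjabi", "ta": "Tamil",
}

def check_language(code: str) -> str:
    # Hypothetical helper: fail early with a clear message instead of
    # hitting a confusing error deep inside the model.
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}. "
                         f"Choose one of {sorted(SUPPORTED_LANGUAGES)}")
    return SUPPORTED_LANGUAGES[code]
```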

---

### 🙏 Credits

Massive thanks to the original authors and repositories that made this possible:

- https://huggingface.co/ai4bharat/IndicF5
- https://huggingface.co/wgs/F5-TTS-Faster
- https://github.com/DakeQQ/F5-TTS-ONNX
- https://github.com/SWivid/F5-TTS

---

### 📜 Terms of Use

By using this model, you agree to only clone voices for which you have **explicit permission**.
Unauthorized voice cloning is **strictly prohibited**. Any misuse of this model is the **sole responsibility of the user**.

Please don’t misuse this. Let’s build cool stuff, not cause trouble.

---