Commit 9d1f8e0
1 Parent(s): be6dcb3
Add application file
- README.md +94 -14
- VOCALIS/__init__.py +2 -0
- VOCALIS/agent.py +5 -0
- VOCALIS/task.py +121 -0
- app.py +155 -0
- edgeTTsLang.py +272 -0
- requirements.txt +5 -0
README.md
CHANGED
@@ -1,14 +1,94 @@
+# 🎙️ AI VoiceCraft: Text-to-Speech Studio 🚀
+
+## Overview
+
+AI VoiceCraft is a Gradio web application that generates text with the Gemini AI model and converts it into natural-sounding speech with Microsoft Edge TTS.
+
+## Features
+
+- **Dynamic Content Generation:**
+    - Generate various content types, including stories, news, podcasts, and more.
+    - Customize content length, theme, and style.
+    - Use Gemini AI for creative and contextually relevant text output.
+- **High-Quality Text-to-Speech:**
+    - Use Microsoft Edge TTS for realistic voice synthesis.
+    - Support for multiple languages and voices.
+    - Fine-tune speech rate and pitch for optimal delivery.
+- **User-Friendly Interface:**
+    - Intuitive Gradio interface for easy navigation and control.
+    - Error and warning messages shown alongside the output.
+    - Custom Gradio theme for a better user experience.
+- **Customization Options:**
+    - Adjust the creativity level of the AI content generation.
+    - Enter custom prompts to fine-tune the AI output.
+    - Adjust speech rate and pitch to fit your needs.
+
+## Getting Started
+
+### Prerequisites
+
+- Python 3.10+ (required by Gradio 5)
+- Internet connection (for API access and TTS)
+- API key for the Gemini model
+
+### Installation
+
+1. Clone the repository:
+
+    ```bash
+    git clone <repository_url>
+    cd <repository_directory>
+    ```
+
+2. Install the required Python packages:
+
+    ```bash
+    pip install gradio requests edge-tts google-generativeai nest_asyncio
+    ```
+3. Set your Gemini API key in the `API_KEY` environment variable (it is read in `VOCALIS/task.py`).
+4. Run the application:
+
+    ```bash
+    python app.py
+    ```
+
+
+5. Open your web browser and navigate to the local URL provided by Gradio (usually `http://127.0.0.1:7860`).
+
+## Usage
+
+1. Select the desired content type from the dropdown menu.
+2. Choose the language and voice for the TTS output.
+3. Adjust the output style, content length, and theme as needed.
+4. Enter any custom text or instructions in the customization field.
+5. Adjust the speech rate and pitch using the sliders.
+6. Click the "Submit" button to generate the text and audio.
+7. Review the generated text and listen to the audio output.
+
+## Code Structure
+
+- `app.py`: Main application script that integrates Gradio, content generation, and TTS.
+- `VOCALIS/`: Package containing the `Agent` (`agent.py`) and `ContentGenerator` (`task.py`) classes for AI content generation.
+- `edgeTTsLang.py`: Dictionary containing the language and voice codes for Microsoft Edge TTS.
+
+## Dependencies
+
+- `gradio`: For building the web interface.
+- `requests`: For making HTTP requests to the API.
+- `edge-tts`: For text-to-speech conversion.
+- `google-generativeai`: For interacting with the Gemini AI model.
+- `asyncio`: For asynchronous operations (part of the Python standard library).
+- `nest_asyncio`: For running inside an already-active event loop, for example in Jupyter notebooks.
+
+## Contributing
+
+Contributions are welcome! Please feel free to submit pull requests or open issues for bug fixes, feature requests, or improvements.
+
+## License
+
+This project is licensed under the MIT License.
+
+## Gradio Theme
+
+The Gradio interface uses a custom theme. You can customize it further by modifying the theme settings in the `create_demo` function.
+
VOCALIS/__init__.py
ADDED
@@ -0,0 +1,2 @@
+from .agent import Agent
+from .task import ContentGenerator
VOCALIS/agent.py
ADDED
@@ -0,0 +1,5 @@
+class Agent:
+    def __init__(self, model: str, temperature: float = 0.6, role: str = "Content Creator"):
+        self.model = model
+        self.temperature = temperature
+        self.role = role
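`Agent` only stores settings; the temperature it carries is chosen in `app.py`, which maps the "Output Style" label to a number and falls back to 0.6 for unknown labels. A minimal sketch of that hand-off follows. The helper `agent_for_style` is hypothetical (it is not part of this commit), and the label table is abbreviated to three of the eight entries.

```python
class Agent:
    """Same fields as VOCALIS/agent.py, repeated so the sketch is self-contained."""

    def __init__(self, model: str, temperature: float = 0.6, role: str = "Content Creator"):
        self.model = model
        self.temperature = temperature
        self.role = role


# Abbreviated copy of the label table in app.py (three of its eight entries).
TEMPERATURE_MAP = {
    "Precise (Deterministic)": 0.1,
    "Balanced (Moderate Creativity)": 0.5,
    "Experimental (Maximum Randomness)": 0.95,
}


def agent_for_style(output_style: str, model: str = "gemini-2.0-flash") -> Agent:
    """Hypothetical helper: build an Agent, using 0.6 when the label is unknown."""
    return Agent(model=model, temperature=TEMPERATURE_MAP.get(output_style, 0.6))
```

Keeping the mapping next to the `Agent` construction would make the fallback temperature visible in one place instead of being split between the class default and the `.get()` default.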
VOCALIS/task.py
ADDED
@@ -0,0 +1,121 @@
+from .agent import Agent
+import os
+import logging
+import re
+import google.generativeai as genai
+from google.generativeai.types import GenerationConfig
+
+
+# Configure Gemini AI API
+api_key = os.getenv("API_KEY")
+if not api_key:
+    raise ValueError("API Key is missing. Set the API_KEY environment variable.")
+genai.configure(api_key=api_key)
+
+logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
+
+
+class ContentGenerator:
+    def __init__(self, agent: Agent, content_type: str = "story", language: str = "English", content_length: int = 200,
+                 theme: str = "General/None", expectations: str = ""):
+        self.agent = agent
+        self.content_type = content_type.strip().lower()
+        self.language = language.strip()
+        self.goal = self._get_default_goal()
+        self.content_length = content_length  # Target upper bound, in words
+        self.theme = theme.strip()
+        self.expectations = expectations.strip()
+
+        # Input validation
+        if self.content_type not in [
+            "story", "social", "news", "motivational", "explainer", "advertisement", "interview", "podcast",
+            "testimonial", "comedy", "audiobook", "documentary", "meditation", "education", "poem", "recipe", "script",
+            "summary", "email", "blog"
+        ]:
+            raise ValueError(f"Invalid content type: {self.content_type}")
+        # if self.language not in languages:
+        #     raise ValueError(f"Invalid language: {self.language}")
+
+    def _get_default_goal(self) -> str:
+        default_goals = {
+            "story": "Generate a vivid, engaging, and natural-sounding short story suitable for narration.",
+            "social": "Create a casual, engaging, and conversational social media script that sounds authentic.",
+            "news": "Write a professional and well-structured news report optimized for audio presentation.",
+            "motivational": "Generate an inspiring and natural motivational speech with a strong emotional connection.",
+            "explainer": "Break down a complex topic in a clear and engaging way, suitable for an audio explanation.",
+            "advertisement": "Write a persuasive and compelling ad script that feels engaging and natural.",
+            "interview": "Generate a structured, conversational interview with natural question-answer flow.",
+            "podcast": "Write a structured podcast script with natural dialogue and engaging discussions.",
+            "testimonial": "Create an authentic-sounding customer testimonial suitable for an audio review.",
+            "comedy": "Write a humorous monologue or short sketch with a natural comedic timing.",
+            "audiobook": "Generate a structured audiobook chapter with expressive dialogue and immersive narration.",
+            "documentary": "Create a professional and informative documentary narration with a storytelling approach.",
+            "meditation": "Write a soothing guided meditation script designed for relaxation and mindfulness.",
+            "education": "Generate a structured and clear educational script that is easy to follow in an audio format.",
+            "poem": "Generate a beautiful and expressive poem with a natural flow.",
+            "recipe": "Write a clear and easy-to-follow recipe suitable for audio instructions.",
+            "script": "Generate a well-structured script for a short video or audio segment.",
+            "summary": "Create a concise and accurate summary of a given topic.",
+            "email": "Write a professional and well-formatted email.",
+            "blog": "Generate an engaging and informative blog post."
+        }
+        return default_goals.get(self.content_type,
+                                 "Generate a vivid, engaging, and natural-sounding short story suitable for narration.")
+
+    def _build_prompt(self) -> str:
+        prompt = (
+            f"Role: You are a professional voice-over script writer specializing in {self.content_type} generation for natural speech synthesis.\n"
+            f"Task: Create a high-quality, natural-sounding script in {self.language} optimized for text-to-speech (TTS).\n"
+            f"Tone: Maintain a conversational and engaging tone, as if speaking directly to a listener.\n"
+            f"Structure: Use short, clear sentences. Organize the content into logical paragraphs for easy audio comprehension.\n"
+            f"Goal: {self.goal}\n"
+            f"Constraints:\n"
+            f"- Keep the script under {self.content_length} words.\n"
+            f"- Use simple, direct language. Avoid complex jargon or unusual words that may be mispronounced by TTS.\n"
+            f"- Do not explicitly state the content type (e.g., 'This is a story', 'Here is a script for voice-over...', etc.).\n"
+            f"- Avoid excessive use of abbreviations, as they may not be pronounced correctly by TTS.\n"
+            f"- Ensure smooth sentence transitions to maintain a natural flow when spoken aloud.\n"
+
+            f"Instructions for Natural Pacing and Pauses:\n"
+            f"- Use punctuation strategically (commas, ellipses, and dashes) to guide pauses in speech.\n"
+            f"- Insert line breaks between key ideas to improve speech rhythm and avoid monotony.\n"
+            f"- Break down long sentences into shorter, more digestible phrases to improve clarity.\n"
+
+            f"Instructions for Emphasis:\n"
+            f"- Use ALL CAPS or spacing between letters for words that should be emphasized.\n"
+            f"- Provide phonetic hints for difficult or unusual words if necessary.\n"
+
+            f"Output:\n"
+            f"- Return ONLY the generated script. Do not include any introductory phrases like 'Here is a script...' or explanations.\n"
+        )
+
+        if self.theme and self.theme != "General/None":
+            prompt += f"Theme/Nature: {self.theme}\n"
+
+        if self.expectations:
+            prompt += f"User Expectations: {self.expectations}\n"
+
+        return prompt
+
+
+
+    def generate_content(self) -> str:
+        try:
+            model = genai.GenerativeModel(self.agent.model)
+            prompt = self._build_prompt()
+            contents = [{"parts": [{"text": prompt}]}]
+            generation_config = GenerationConfig(temperature=self.agent.temperature, max_output_tokens=1024)
+
+            response = model.generate_content(contents=contents, generation_config=generation_config)
+            output = response.text
+
+            output = output.strip()
+            output = re.sub(r'\s+', ' ', output)
+
+            return output
+
+        except Exception as e:
+            logging.error(f"Error generating content: {e}")
+            return f"Generation failed: {e}"
+
+
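One detail of `generate_content` is worth illustrating: the prompt asks the model to insert line breaks between key ideas, but the cleanup step `re.sub(r'\s+', ' ', output)` then collapses every line break into a single space. The sketch below contrasts that behavior with a variant that keeps paragraph breaks. Both function names are hypothetical, and the second function is an alternative for comparison, not what the commit does.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Mirror of the cleanup in generate_content: every whitespace run becomes one space."""
    return re.sub(r"\s+", " ", text.strip())


def collapse_keep_paragraphs(text: str) -> str:
    """Hypothetical variant: tidy spacing inside paragraphs but keep blank-line breaks."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


sample = "Hello,   world.\n\nNext  idea...\nsame paragraph. "
print(repr(collapse_whitespace(sample)))
print(repr(collapse_keep_paragraphs(sample)))
```

Whether paragraph breaks change how Edge TTS paces the audio is not something this commit shows, so the second variant is only worth adopting after listening to the difference.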
app.py
ADDED
@@ -0,0 +1,155 @@
+import gradio as gr
+import asyncio
+import tempfile
+import logging
+import requests
+from VOCALIS import Agent, ContentGenerator
+from edgeTTsLang import languages
+
+
+logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
+logger = logging.getLogger(__name__)
+
+def generate_the_content(content_type, language, output_style, content_length, theme, expectations):
+    try:
+        temperature_map = {
+            "Precise (Deterministic)": 0.1,
+            "Very Focused (Low Randomness)": 0.3,
+            "Moderately Focused (Slight Randomness)": 0.4,
+            "Balanced (Moderate Creativity)": 0.5,
+            "Slightly Creative (Moderate Randomness)": 0.6,
+            "Creative (High Randomness)": 0.7,
+            "Highly Creative (Very High Randomness)": 0.8,
+            "Experimental (Maximum Randomness)": 0.95,
+        }
+        temperature = temperature_map.get(output_style, 0.6)
+        agent = Agent(model="gemini-2.0-flash", temperature=temperature)
+        generator = ContentGenerator(agent, content_type, language, content_length, theme, expectations)
+        output = generator.generate_content()
+
+        return output
+
+    except ValueError as ve:
+        return f"Input Error: {ve}"
+    except requests.exceptions.ConnectionError:
+        return "Network Error: Could not connect to API. Please check your internet connection."
+    except Exception as e:
+        return f"General Error: {e}"
+
+async def text_to_speech(text, voice, rate, pitch):
+    import edge_tts
+    if not text.strip():
+        return None, "Please enter text to convert."
+    if not voice:
+        return None, "Please select a voice."
+    rate_str = f"{rate:+d}%"
+    pitch_str = f"{pitch:+d}Hz"
+    communicate = edge_tts.Communicate(text, voice, rate=rate_str, pitch=pitch_str)
+    with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as tmp_file:
+        tmp_path = tmp_file.name
+        await communicate.save(tmp_path)
+    return tmp_path, None
+
+async def tts_interface(content_type, language, voice, output_style, content_length, theme, Customization, rate, pitch):
+    text_output = generate_the_content(content_type, language, output_style, content_length, theme, Customization)
+    if text_output.startswith(("Input Error:", "Network Error:", "General Error:", "Generation failed:")):
+        return None, None, gr.Markdown(text_output)
+
+    audio_file, warning = await text_to_speech(text_output, languages[language][voice], rate, pitch)
+
+    if warning:
+        return text_output, None, gr.Markdown(warning)
+
+    return text_output, audio_file, None
+
+def create_demo():
+    language_choices = list(languages.keys())
+
+    custom_theme = gr.themes.Soft(
+        primary_hue="indigo",
+        secondary_hue="blue",
+        neutral_hue="slate",
+        radius_size=gr.themes.sizes.radius_sm,
+        font=[gr.themes.GoogleFont("Montserrat"), "Arial", "sans-serif"],
+    )
+
+    demo = gr.Interface(
+        fn=tts_interface,
+        theme=custom_theme,
+        inputs=[
+            gr.Dropdown(label="Content Type", choices=[
+                "story", "social", "news", "motivational", "explainer", "advertisement", "interview", "podcast",
+                "testimonial", "comedy", "audiobook", "documentary", "meditation", "education", "poem", "recipe",
+                "script", "summary", "email", "blog"
+            ], value="story"),
+            gr.Dropdown(label="Language", choices=language_choices, value=language_choices[0] if language_choices else ""),
+            gr.Dropdown(label="Voice", choices=["Female", "Male"], value="Female"),
+            gr.Dropdown(label="Output Style", choices=[
+                "Precise (Deterministic)", "Very Focused (Low Randomness)", "Moderately Focused (Slight Randomness)",
+                "Balanced (Moderate Creativity)", "Slightly Creative (Moderate Randomness)",
+                "Creative (High Randomness)", "Highly Creative (Very High Randomness)",
+                "Experimental (Maximum Randomness)"
+            ], value="Balanced (Moderate Creativity)"),
+            gr.Slider(label="Content Length (Words)", minimum=100, maximum=1000, value=200, step=10),
+            gr.Dropdown(label="Theme/Nature (Optional)", choices=[
+                "General/None", "Narrative/Storytelling", "Informative/Educational", "Descriptive/Atmospheric",
+                "Persuasive/Argumentative", "Humorous/Comedic", "Emotional/Inspirational", "Technical/Scientific",
+                "Historical/Cultural", "Modern/Contemporary", "Futuristic/Sci-Fi", "Fantasy/Mythical",
+                "Mystery/Suspense", "Adventure/Exploration", "Realistic/Documentary", "Philosophical/Reflective",
+                "Social/Relational", "Environmental/Nature", "Personal/Anecdotal"
+            ], value="General/None"),
+            gr.Textbox(label="Customization", placeholder="Add any extra information to help customize the generated content"),
+            gr.Slider(minimum=-50, maximum=50, value=0, label="Speech Rate Adjustment (%)", step=1),
+            gr.Slider(minimum=-20, maximum=20, value=0, label="Pitch Adjustment (Hz)", step=1)
+        ],
+        outputs=[
+            gr.Textbox(label="Generated Text"),
+            gr.Audio(label="Generated Audio", type="filepath"),
+            gr.Markdown(label="Error/Warning", visible=True)
+        ],
+        title="✨ AI VoiceCraft: Text-to-Speech Studio 🎙️",
+        description="""
+🚀 Transform your text into captivating audio! 🚀
+
+This tool generates AI-powered content and converts it into lifelike speech using Microsoft Edge TTS.
+
+🔹 **Features at a Glance:**
+🌍 Supports multiple languages and voices
+🎚️ Adjust speech rate and pitch for natural delivery
+📝 Generate dynamic content: stories, news, podcasts & more
+🎭 Customize tone, length, and style to fit your needs
+
+""",
+        article="""
+# 🌟 Welcome to AI VoiceCraft! 🌟
+
+**Unleash the power of AI-driven text-to-speech.**
+
+This advanced application blends **cutting-edge AI content generation** with high-quality speech synthesis to create immersive audio experiences.
+
+## 🎤 Key Highlights:
+🔊 Natural and expressive voice output
+📖 AI-powered script generation tailored for speech
+⚙️ Fine-tune pitch, rate, and delivery style
+
+🔗 [Discover more AI tools@MusabbirKM](https://www.example.com/ai-tools)
+""",
+
+        allow_flagging="never",
+        api_name=None,
+    )
+    return demo
+
+async def main():
+    demo = create_demo()
+    demo.queue(default_concurrency_limit=5)
+    demo.launch(show_api=False)
+
+
+if __name__ == "__main__":
+    try:
+        asyncio.run(main())
+    except RuntimeError:
+        import nest_asyncio
+        nest_asyncio.apply()
+        asyncio.run(main())
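`text_to_speech` turns the slider values into the signed strings passed to `edge_tts.Communicate` (for example `+10%` and `-5Hz`) with the `:+d` format code. That code only accepts integers and raises `ValueError` for a float. Whether the sliders ever deliver floats here is an assumption, not something the commit demonstrates, so the following is a defensive sketch with hypothetical helper names rather than a required fix.

```python
def format_rate(rate) -> str:
    """Return a signed percentage such as '+10%'; accepts ints or floats."""
    return f"{int(round(rate)):+d}%"


def format_pitch(pitch) -> str:
    """Return a signed pitch offset such as '-5Hz'; accepts ints or floats."""
    return f"{int(round(pitch)):+d}Hz"


# Zero still carries an explicit sign, matching what the ':+d' code in app.py produces.
print(format_rate(0), format_rate(10), format_pitch(-5.0))
```

Rounding before formatting keeps the output identical to the original code for integer inputs while tolerating values such as `10.0`.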
edgeTTsLang.py
ADDED
@@ -0,0 +1,272 @@
+languages = {
+    "Malayalam (India)": {
+        "Female": "ml-IN-SobhanaNeural",
+        "Male": "ml-IN-MidhunNeural"
+    },
+    "Hindi (India)": {
+        "Female": "hi-IN-SwaraNeural",
+        "Male": "hi-IN-MadhurNeural"
+    },
+    "Kannada (India)": {
+        "Female": "kn-IN-SapnaNeural",
+        "Male": "kn-IN-GaganNeural"
+    },
+    "Tamil (India)": {
+        "Female": "ta-IN-PallaviNeural",
+        "Male": "ta-IN-ValluvarNeural"
+    },
+    "Telugu (India)": {
+        "Female": "te-IN-ShrutiNeural",
+        "Male": "te-IN-MohanNeural"
+    },
+    "Urdu (India)": {
+        "Female": "ur-IN-GulNeural",
+        "Male": "ur-IN-SarfarazNeural"
+    },
+    "Gujarati (India)": {
+        "Female": "gu-IN-DhwaniNeural",
+        "Male": "gu-IN-NiranjanNeural"
+    },
+    "Marathi (India)": {
+        "Female": "mr-IN-AarohiNeural",
+        "Male": "mr-IN-ManoharNeural"
+    },
+    "Odia (India)": {
+        "Female": "or-IN-TariniNeural",
+        "Male": "or-IN-BiswajitNeural"
+    },
+    "Punjabi (India)": {
+        "Female": "pa-IN-GagandeepNeural",
+        "Male": "pa-IN-NirvairNeural"
+    },
+    "Assamese (India)": {
+        "Female": "as-IN-PariNeural",
+        "Male": "as-IN-NiloyNeural"
+    },
+    "Afrikaans (South Africa)": {
+        "Female": "af-ZA-AdriNeural",
+        "Male": "af-ZA-WillemNeural"
+    },
+    "Albanian (Albania)": {
+        "Female": "sq-AL-AnilaNeural",
+        "Male": "sq-AL-IlirNeural"
+    },
+    "Amharic (Ethiopia)": {
+        "Female": "am-ET-MekdesNeural",
+        "Male": "am-ET-AmehaNeural"
+    },
+    "Arabic (Algeria)": {
+        "Female": "ar-DZ-AminaNeural",
+        "Male": "ar-DZ-IsmaelNeural"
+    },
+    "Arabic (Bahrain)": {
+        "Female": "ar-BH-LailaNeural",
+        "Male": "ar-BH-AliNeural"
+    },
+    "Arabic (Egypt)": {
+        "Female": "ar-EG-SalmaNeural",
+        "Male": "ar-EG-ShakirNeural"
+    },
+    "Arabic (Iraq)": {
+        "Female": "ar-IQ-RanaNeural",
+        "Male": "ar-IQ-BasselNeural"
+    },
+    "Arabic (Jordan)": {
+        "Female": "ar-JO-SanaNeural",
+        "Male": "ar-JO-TaimNeural"
+    },
+    "Arabic (Kuwait)": {
+        "Female": "ar-KW-NouraNeural",
+        "Male": "ar-KW-FahedNeural"
+    },
+    "Arabic (Lebanon)": {
+        "Female": "ar-LB-LaylaNeural",
+        "Male": "ar-LB-RamiNeural"
+    },
+    "Arabic (Libya)": {
+        "Female": "ar-LY-ImanNeural",
+        "Male": "ar-LY-OmarNeural"
+    },
+    "Arabic (Morocco)": {
+        "Female": "ar-MA-MounaNeural",
+        "Male": "ar-MA-JamalNeural"
+    },
+    "Arabic (Oman)": {
+        "Female": "ar-OM-AyshaNeural",
+        "Male": "ar-OM-SultanNeural"
+    },
+    "Arabic (Qatar)": {
+        "Female": "ar-QA-AmalNeural",
+        "Male": "ar-QA-MoazNeural"
+    },
+    "Arabic (Saudi Arabia)": {
+        "Female": "ar-SA-HodaNeural",
+        "Male": "ar-SA-FahdNeural"
+    },
+    "Arabic (Syria)": {
+        "Female": "ar-SY-AmanyNeural",
+        "Male": "ar-SY-LaithNeural"
+    },
+    "Arabic (Tunisia)": {
+        "Female": "ar-TN-ReemNeural",
+        "Male": "ar-TN-HediNeural"
+    },
+    "Arabic (UAE)": {
+        "Female": "ar-AE-FatimaNeural",
+        "Male": "ar-AE-HamdanNeural"
+    },
+    "Arabic (Yemen)": {
+        "Female": "ar-YE-MaryamNeural",
+        "Male": "ar-YE-SalehNeural"
+    },
+    "Armenian (Armenia)": {
+        "Female": "hy-AM-AnahitNeural",
+        "Male": "hy-AM-HaykNeural"
+    },
+    "Basque (Spain)": {
+        "Female": "eu-ES-AinhoaNeural",
+        "Male": "eu-ES-AnderNeural"
+    },
+    "Bengali (Bangladesh)": {
+        "Female": "bn-BD-NabanitaNeural",
+        "Male": "bn-BD-PradeepNeural"
+    },
+    "Bengali (India)": {
+        "Female": "bn-IN-BashantiNeural",
+        "Male": "bn-IN-TanishNeural"
+    },
+    "English (India)": {
+        "Female": "en-IN-NeerjaNeural",
+        "Male": "en-IN-PrabhatNeural"
+    },
+    "English (Australia)": {
+        "Female": "en-AU-NatashaNeural",
+        "Male": "en-AU-WilliamNeural"
+    },
+    "English (Canada)": {
+        "Female": "en-CA-ClaraNeural",
+        "Male": "en-CA-LiamNeural"
+    },
+    "English (Ireland)": {
+        "Female": "en-IE-EmilyNeural",
+        "Male": "en-IE-ConnorNeural"
+    },
+    "English (New Zealand)": {
+        "Female": "en-NZ-MollyNeural",
+        "Male": "en-NZ-MitchellNeural"
+    },
+    "English (South Africa)": {
+        "Female": "en-ZA-LeahNeural",
+        "Male": "en-ZA-LukeNeural"
+    },
+    "English (United Kingdom)": {
+        "Female": "en-GB-LibbyNeural",
+        "Male": "en-GB-RyanNeural"
+    },
+    "English (United States)": {
+        "Female": "en-US-JennyNeural",
+        "Male": "en-US-GuyNeural"
+    },
+    "English (Uganda)": {
+        "Female": "en-UG-EmilyNeural",
+        "Male": "en-UG-ConnorNeural"
+    },
+    "Bosnian (Bosnia and Herzegovina)": {
+        "Female": "bs-BA-VesnaNeural",
+        "Male": "bs-BA-GoranNeural"
+    },
+    "Bulgarian (Bulgaria)": {
+        "Female": "bg-BG-KalinaNeural",
+        "Male": "bg-BG-BorislavNeural"
+    },
+    "Catalan (Spain)": {
+        "Female": "ca-ES-AlbaNeural",
+        "Male": "ca-ES-EnricNeural"
+    },
+    "Chinese (Cantonese, Traditional)": {
+        "Female": "yue-HK-HiuGaaiNeural",
+        "Male": "yue-HK-WanLungNeural"
+    },
+    "Chinese (Mandarin, Simplified)": {
+        "Female": "zh-CN-XiaoxiaoNeural",
+        "Male": "zh-CN-YunxiNeural"
+    },
+    "Chinese (Mandarin, Traditional)": {
+        "Female": "zh-TW-HsiaoYuNeural",
+        "Male": "zh-TW-YunJheNeural"
+    },
+    "Croatian (Croatia)": {
+        "Female": "hr-HR-GabrijelaNeural",
+        "Male": "hr-HR-SreckoNeural"
+    },
+    "Czech (Czech Republic)": {
+        "Female": "cs-CZ-VlastaNeural",
+        "Male": "cs-CZ-AntoninNeural"
+    },
+    "Danish (Denmark)": {
+        "Female": "da-DK-ChristelNeural",
+        "Male": "da-DK-JeppeNeural"
+    },
+    "Dutch (Belgium)": {
+        "Female": "nl-BE-DenaNeural",
+        "Male": "nl-BE-ArnaudNeural"
+    },
+    "Dutch (Netherlands)": {
+        "Female": "nl-NL-ColetteNeural",
+        "Male": "nl-NL-MaartenNeural"
+    },
+    "Estonian (Estonia)": {
+        "Female": "et-EE-AnuNeural",
+        "Male": "et-EE-KertNeural"
+    },
+    "Filipino (Philippines)": {
+        "Female": "fil-PH-BlessicaNeural",
+        "Male": "fil-PH-AngeloNeural"
+    },
+    "Finnish (Finland)": {
+        "Female": "fi-FI-NooraNeural",
+        "Male": "fi-FI-HarriNeural"
+    },
+    "French (Belgium)": {
+        "Female": "fr-BE-CharlineNeural",
+        "Male": "fr-BE-GerardNeural"
+    },
+    "French (Canada)": {
+        "Female": "fr-CA-SylvieNeural",
+        "Male": "fr-CA-AntoineNeural"
+    },
+    "French (France)": {
+        "Female": "fr-FR-DeniseNeural",
+        "Male": "fr-FR-HenriNeural"
+    },
+    "Galician (Spain)": {
+        "Female": "gl-ES-SabelaNeural",
+        "Male": "gl-ES-RoiNeural"
+    },
+    "Georgian (Georgia)": {
+        "Female": "ka-GE-EkaNeural",
+        "Male": "ka-GE-GiorgiNeural"
+    },
+    "German (Austria)": {
+        "Female": "de-AT-IngridNeural",
+        "Male": "de-AT-JonasNeural"
+    },
+    "German (Germany)": {
+        "Female": "de-DE-KatjaNeural",
+        "Male": "de-DE-ConradNeural"
+    },
+    "German (Switzerland)": {
+        "Female": "de-CH-LeniNeural",
+        "Male": "de-CH-JanNeural"
+    },
+    "Indonesian (Indonesia)": {
+        "Female": "id-ID-GadisNeural",
+        "Male": "id-ID-ArdiNeural"
+    },
+    "Japanese (Japan)": {
+        "Female": "ja-JP-NanamiNeural",
+        "Male": "ja-JP-KeitaNeural"
+    },
+}
+
+
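Because `app.py` looks voices up with `languages[language][voice]` and passes the result straight to Edge TTS, a misspelled or unavailable voice name only surfaces when a user selects it. A small check like the one below could flag such entries ahead of time. It is a sketch under assumptions: the function name is hypothetical, the table is a two-entry excerpt, and the `known` set is hand-written for illustration. In practice that set might be built from the `ShortName` field of the voices returned by `edge_tts.list_voices()`, which needs network access and is not exercised here.

```python
def find_unknown_voices(languages: dict, known_short_names: set) -> list:
    """Return (language, gender, voice) triples whose voice is not in known_short_names."""
    missing = []
    for language, voices in languages.items():
        for gender, short_name in voices.items():
            if short_name not in known_short_names:
                missing.append((language, gender, short_name))
    return missing


# Two-entry excerpt of the table plus an illustrative, hand-written "known" set.
sample = {
    "English (United States)": {"Female": "en-US-JennyNeural", "Male": "en-US-GuyNeural"},
    "Japanese (Japan)": {"Female": "ja-JP-NanamiNeural", "Male": "ja-JP-KeitaNeural"},
}
known = {"en-US-JennyNeural", "en-US-GuyNeural", "ja-JP-NanamiNeural"}
print(find_unknown_voices(sample, known))
```

Running such a check once at startup, or in a test, would turn a per-request failure into a single upfront report.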
requirements.txt
ADDED
@@ -0,0 +1,5 @@
+gradio~=5.16.0
+requests~=2.32.3
+yt-dlp~=2025.1.26
+
+