Musabbirkm committed on
Commit
9d1f8e0
·
1 Parent(s): be6dcb3

Add application file

Files changed (7)
  1. README.md +94 -14
  2. VOCALIS/__init__.py +2 -0
  3. VOCALIS/agent.py +5 -0
  4. VOCALIS/task.py +121 -0
  5. app.py +155 -0
  6. edgeTTsLang.py +272 -0
  7. requirements.txt +5 -0
README.md CHANGED
@@ -1,14 +1,94 @@
1
- ---
2
- title: ContentVoiceGen
3
- emoji: 🐠
4
- colorFrom: gray
5
- colorTo: indigo
6
- sdk: gradio
7
- sdk_version: 5.17.1
8
- app_file: app.py
9
- pinned: false
10
- license: apache-2.0
11
- short_description: AIpowered text-to-speech generator for storytelling, podcast
12
- ---
13
-
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
+ # 🎙️ AI VoiceCraft: Text-to-Speech Studio 🚀
2
+
3
+ ## Overview
4
+
5
+ AI VoiceCraft is a powerful web application built with Gradio that leverages cutting-edge AI to generate dynamic text content and transform it into natural-sounding speech. This tool integrates the Gemini AI model for content generation and Microsoft Edge TTS for high-quality audio synthesis.
6
+
7
+ ## Features
8
+
9
+ - **Dynamic Content Generation:**
10
+ - Generate various content types, including stories, news, podcasts, and more.
11
+ - Customize content length, theme, and style.
12
+ - Utilize Gemini AI for creative and contextually relevant text output.
13
+ - **High-Quality Text-to-Speech:**
14
+ - Leverage Microsoft Edge TTS for realistic voice synthesis.
15
+ - Support for multiple languages and voices.
16
+ - Fine-tune speech rate and pitch for optimal delivery.
17
+ - **User-Friendly Interface:**
18
+ - Intuitive Gradio interface for easy navigation and control.
19
+ - Real-time feedback and error handling.
20
+ - Attractive theme applied for better user experience.
21
+ - **Customization Options:**
22
+ - Adjust the creativity level of the AI content generation.
23
+ - Input custom prompts for fine-tuning the AI outputs.
24
+ - Adjust speech rate and pitch to fit your needs.
25
+
26
+ ## Getting Started
27
+
28
+ ### Prerequisites
29
+
30
+ - Python 3.10+ (required by Gradio 5)
31
+ - Internet connection (for API access and TTS)
32
+ - API Key for Gemini Model.
33
+
34
+ ### Installation
35
+
36
+ 1. Clone the repository:
37
+
38
+ ```bash
39
+ git clone <repository_url>
40
+ cd <repository_directory>
41
+ ```
42
+
43
+ 2. Install the required Python packages:
44
+
45
+ ```bash
46
+ pip install gradio requests edge-tts google-generativeai nest_asyncio
47
+ ```
48
+ 3. Set your Gemini API key in the `API_KEY` environment variable, which `VOCALIS/task.py` reads at import time.
49
+ 4. Run the application:
50
+
51
+ ```bash
52
+ python app.py
53
+ ```
54
+
55
+
56
+ 5. Open your web browser and navigate to the local URL provided by Gradio (usually `http://127.0.0.1:7860`).
57
+
58
+ ## Usage
59
+
60
+ 1. Select the desired content type from the dropdown menu.
61
+ 2. Choose the language and voice for the TTS output.
62
+ 3. Adjust the output style, content length, and theme as needed.
63
+ 4. Enter any custom text or instructions in the customization field.
64
+ 5. Adjust the speech rate and pitch using the sliders.
65
+ 6. Click the "Submit" button to generate the text and audio.
66
+ 7. Review the generated text and listen to the audio output.
67
+
68
+ ## Code Structure
69
+
70
+ - `app.py`: Main application script that integrates Gradio, content generation, and TTS.
71
+ - `VOCALIS/`: Package containing the `Agent` class (`agent.py`) and the `ContentGenerator` class (`task.py`) for AI content generation.
72
+ - `edgeTTsLang.py`: Dictionary containing the language and voice codes for Microsoft Edge TTS.
73
+
74
+ ## Dependencies
75
+
76
+ - `gradio`: For building the web interface.
77
+ - `requests`: For making HTTP requests to the API.
78
+ - `edge-tts`: For text-to-speech conversion.
79
+ - `google-generativeai`: For interacting with the Gemini AI model.
80
+ - `asyncio`: For asynchronous operations (part of the Python standard library; no installation needed).
81
+ - `nest_asyncio`: For handling nested asyncio event loops, for example in Jupyter notebooks.
82
+
83
+ ## Contributing
84
+
85
+ Contributions are welcome! Please feel free to submit pull requests or open issues for bug fixes, feature requests, or improvements.
86
+
87
+ ## License
88
+
89
+ This project is licensed under the MIT License.
90
+
91
+ ## Gradio Theme
92
+
93
+ To enhance the user experience, an attractive theme has been applied to the Gradio interface. You can customize the theme further by modifying the Gradio theme settings in the `create_demo` function.
94
+
VOCALIS/__init__.py ADDED
@@ -0,0 +1,2 @@
1
+ from .agent import Agent
2
+ from .task import ContentGenerator
VOCALIS/agent.py ADDED
@@ -0,0 +1,5 @@
1
+ class Agent:
2
+     def __init__(self, model: str, temperature: float = 0.6, role: str = "Content Creator"):
3
+         self.model = model
4
+         self.temperature = temperature
5
+         self.role = role
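
The `Agent` class above is a plain settings holder. As a minimal sketch of the same idea with a range check on `temperature`, the following uses a dataclass. The `ValidatedAgent` name and the 0.0 to 2.0 bounds are assumptions made for illustration and are not part of this commit; the real limits depend on the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidatedAgent:
    """Hypothetical variant of Agent that rejects out-of-range temperatures."""

    model: str
    temperature: float = 0.6
    role: str = "Content Creator"

    def __post_init__(self) -> None:
        # Assumed bounds; check the model's documentation for the real range.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError(f"temperature must be in [0.0, 2.0], got {self.temperature}")


agent = ValidatedAgent(model="gemini-2.0-flash", temperature=0.5)
```

Failing at construction time would surface a bad setting before any API request is made.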
VOCALIS/task.py ADDED
@@ -0,0 +1,121 @@
1
+ from .agent import Agent
2
+ import os
3
+ import logging
4
+ import re
5
+ import google.generativeai as genai
6
+ from google.generativeai.types import GenerationConfig
7
+
8
+
9
+ # Configure Gemini AI API
10
+ api_key = os.getenv("API_KEY")
11
+ if not api_key:
12
+ raise ValueError("API Key is missing. Set the API_KEY environment variable.")
13
+ genai.configure(api_key=api_key)
14
+
15
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
16
+
17
+
18
+ class ContentGenerator:
19
+ def __init__(self, agent: Agent, content_type: str = "story", language: str = "English",content_length: int = 200,
20
+ theme: str = "General/None", expectations: str = ""):
21
+ self.agent = agent
22
+ self.content_type = content_type.strip().lower()
23
+ self.language = language.strip()
24
+ self.goal = self._get_default_goal()
25
+ self.content_length = content_length # Added content length
26
+ self.theme = theme.strip()
27
+ self.expectations = expectations.strip()
28
+
29
+ # Input validation
30
+ if self.content_type not in [
31
+ "story", "social", "news", "motivational", "explainer", "advertisement", "interview", "podcast",
32
+ "testimonial", "comedy", "audiobook", "documentary", "meditation", "education", "poem", "recipe", "script",
33
+ "summary", "email", "blog"
34
+ ]:
35
+ raise ValueError(f"Invalid content type: {self.content_type}")
36
+ # if self.language not in languages:
37
+ # raise ValueError(f"Invalid language: {self.language}")
38
+
39
+ def _get_default_goal(self) -> str:
40
+ default_goals = {
41
+ "story": "Generate a vivid, engaging, and natural-sounding short story suitable for narration.",
42
+ "social": "Create a casual, engaging, and conversational social media script that sounds authentic.",
43
+ "news": "Write a professional and well-structured news report optimized for audio presentation.",
44
+ "motivational": "Generate an inspiring and natural motivational speech with a strong emotional connection.",
45
+ "explainer": "Break down a complex topic in a clear and engaging way, suitable for an audio explanation.",
46
+ "advertisement": "Write a persuasive and compelling ad script that feels engaging and natural.",
47
+ "interview": "Generate a structured, conversational interview with natural question-answer flow.",
48
+ "podcast": "Write a structured podcast script with natural dialogue and engaging discussions.",
49
+ "testimonial": "Create an authentic-sounding customer testimonial suitable for an audio review.",
50
+ "comedy": "Write a humorous monologue or short sketch with a natural comedic timing.",
51
+ "audiobook": "Generate a structured audiobook chapter with expressive dialogue and immersive narration.",
52
+ "documentary": "Create a professional and informative documentary narration with a storytelling approach.",
53
+ "meditation": "Write a soothing guided meditation script designed for relaxation and mindfulness.",
54
+ "education": "Generate a structured and clear educational script that is easy to follow in an audio format.",
55
+ "poem": "Generate a beautiful and expressive poem with a natural flow.",
56
+ "recipe": "Write a clear and easy-to-follow recipe suitable for audio instructions.",
57
+ "script": "Generate a well-structured script for a short video or audio segment.",
58
+ "summary": "Create a concise and accurate summary of a given topic.",
59
+ "email": "Write a professional and well-formatted email.",
60
+ "blog": "Generate an engaging and informative blog post."
61
+ }
62
+ return default_goals.get(self.content_type,
63
+ "Generate a vivid, engaging, and natural-sounding short story suitable for narration.")
64
+
65
+ def _build_prompt(self) -> str:
66
+ prompt = (
67
+ f"Role: You are a professional voice-over script writer specializing in {self.content_type} generation for natural speech synthesis.\n"
68
+ f"Task: Create a high-quality, natural-sounding script in {self.language} optimized for text-to-speech (TTS).\n"
69
+ f"Tone: Maintain a conversational and engaging tone, as if speaking directly to a listener.\n"
70
+ f"Structure: Use short, clear sentences. Organize the content into logical paragraphs for easy audio comprehension.\n"
71
+ f"Goal: {self.goal}\n"
72
+ f"Constraints:\n"
73
+ f"- Keep the script under {self.content_length} words.\n"
74
+ f"- Use simple, direct language. Avoid complex jargon or unusual words that may be mispronounced by TTS.\n"
75
+ f"- Do not explicitly state the content type (e.g., 'This is a story', 'Here is a script for voice-over...', etc.).\n"
76
+ f"- Avoid excessive use of abbreviations, as they may not be pronounced correctly by TTS.\n"
77
+ f"- Ensure smooth sentence transitions to maintain a natural flow when spoken aloud.\n"
78
+
79
+ f"Instructions for Natural Pacing and Pauses:\n"
80
+ f"- Use punctuation strategically (commas, ellipses, and dashes) to guide pauses in speech.\n"
81
+ f"- Insert line breaks between key ideas to improve speech rhythm and avoid monotony.\n"
82
+ f"- Break down long sentences into shorter, more digestible phrases to improve clarity.\n"
83
+
84
+ f"Instructions for Emphasis:\n"
85
+ f"- Use ALL CAPS or spacing between letters for words that should be emphasized.\n"
86
+ f"- Provide phonetic hints for difficult or unusual words if necessary.\n"
87
+
88
+ f"Output:\n"
89
+ f"- Return ONLY the generated script. Do not include any introductory phrases like 'Here is a script...' or explanations.\n"
90
+ )
91
+
92
+ if self.theme and self.theme != "General/None":
93
+ prompt += f"Theme/Nature: {self.theme}\n"
94
+
95
+ if self.expectations:
96
+ prompt += f"User Expectations: {self.expectations}\n"
97
+
98
+ return prompt
99
+
100
+
101
+
102
+ def generate_content(self) -> str:
103
+ try:
104
+ model = genai.GenerativeModel(self.agent.model)
105
+ prompt = self._build_prompt()
106
+ contents = [{"parts": [{"text": prompt}]}]
107
+ generation_config = GenerationConfig(temperature=self.agent.temperature, max_output_tokens=1024)
108
+
109
+ response = model.generate_content(contents=contents, generation_config=generation_config)
110
+ output = response.text
111
+
112
+ output = output.strip()
113
+ output = re.sub(r'\s+', ' ', output)
114
+
115
+ return output
116
+
117
+ except Exception as e:
118
+ logging.error(f"Error generating content: {e}")
119
+ return f"Generation failed: {e}"
120
+
121
+
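
`generate_content` collapses every run of whitespace into a single space, which also removes the line breaks that the prompt asks the model to insert between key ideas. A minimal sketch of an alternative that keeps paragraph breaks while still cleaning up stray spacing is shown here; `normalize_script` is a hypothetical helper and not part of this commit.

```python
import re


def normalize_script(text: str) -> str:
    """Collapse whitespace inside paragraphs but keep one blank line between them."""
    # Split on blank lines (a newline, optional whitespace, another newline).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside each paragraph, squeeze spaces, tabs, and single newlines to one space.
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


raw = "First   idea,\nstill first.\n\n\n  Second idea.  "
print(normalize_script(raw))
```

Whether the TTS engine treats a blank line as a pause is not established by this commit, so the benefit for speech rhythm is an assumption to verify by listening.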
app.py ADDED
@@ -0,0 +1,155 @@
1
+ import gradio as gr
2
+ import asyncio
3
+ import tempfile
4
+ import logging
5
+ import requests
6
+ from VOCALIS import Agent, ContentGenerator
7
+ from edgeTTsLang import languages
8
+
9
+
10
+ logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
11
+ logger = logging.getLogger(__name__)
12
+
13
+ def generate_the_content(content_type, language,output_style,content_length, theme, expectations):
14
+ try:
15
+ temperature_map = {
16
+ "Precise (Deterministic)": 0.1,
17
+ "Very Focused (Low Randomness)": 0.3,
18
+ "Moderately Focused (Slight Randomness)": 0.4,
19
+ "Balanced (Moderate Creativity)": 0.5,
20
+ "Slightly Creative (Moderate Randomness)": 0.6,
21
+ "Creative (High Randomness)": 0.7,
22
+ "Highly Creative (Very High Randomness)": 0.8,
23
+ "Experimental (Maximum Randomness)": 0.95,
24
+ }
25
+ temperature = temperature_map.get(output_style, 0.6)
26
+ agent = Agent(model="gemini-2.0-flash", temperature=temperature)
27
+ generator = ContentGenerator(agent, content_type, language, content_length, theme, expectations)
28
+ output = generator.generate_content()
29
+
30
+ return output
31
+
32
+ except ValueError as ve:
33
+ return f"Input Error: {ve}"
34
+ except requests.exceptions.ConnectionError:
35
+ return "Network Error: Could not connect to API. Please check your internet connection."
36
+ except Exception as e:
37
+ return f"General Error: {e}"
38
+
39
+ async def text_to_speech(text, voice, rate, pitch):
40
+ import edge_tts
41
+ if not text.strip():
42
+ return None, "Please enter text to convert."
43
+ if not voice:
44
+ return None, "Please select a voice."
45
+ rate_str = f"{rate:+d}%"
46
+ pitch_str = f"{pitch:+d}Hz"
47
+ communicate = edge_tts.Communicate(text, voice, rate=rate_str, pitch=pitch_str)
48
+ with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as tmp_file:
49
+ tmp_path = tmp_file.name
50
+ await communicate.save(tmp_path)
51
+ return tmp_path, None
52
+
53
+ async def tts_interface(content_type, language, voice, output_style, content_length, theme, Customization, rate, pitch):
54
+     text_output = generate_the_content(content_type, language, output_style, content_length, theme, Customization)
55
+     if text_output.startswith(("Input Error:", "Network Error:", "General Error:", "Generation failed:")):
56
+         return None, None, gr.Markdown(text_output)
57
+
58
+     audio_file, warning = await text_to_speech(text_output, languages[language][voice], rate, pitch)
59
+
60
+     if warning:
61
+         return text_output, None, gr.Markdown(warning)
62
+
63
+     return text_output, audio_file, None
64
+
65
+ def create_demo():
66
+ language_choices = list(languages.keys())
67
+
68
+ custom_theme = gr.themes.Soft(
69
+ primary_hue="indigo",
70
+ secondary_hue="blue",
71
+ neutral_hue="slate",
72
+ radius_size=gr.themes.sizes.radius_sm,
73
+ font=[gr.themes.GoogleFont("Montserrat"), "Arial", "sans-serif"],
74
+ )
75
+
76
+ demo = gr.Interface(
77
+ fn=tts_interface,
78
+ theme=custom_theme,
79
+ inputs=[
80
+ gr.Dropdown(label="Content Type", choices=[
81
+ "story", "social", "news", "motivational", "explainer", "advertisement", "interview", "podcast",
82
+ "testimonial", "comedy", "audiobook", "documentary", "meditation", "education", "poem", "recipe",
83
+ "script", "summary", "email", "blog"
84
+ ], value="story"),
85
+ gr.Dropdown(label="Language", choices=language_choices, value=language_choices[0] if language_choices else ""),
86
+ gr.Dropdown(label="Voice", choices=["Female", "Male"], value="Female"),
87
+ gr.Dropdown(label="Output Style", choices=[
88
+ "Precise (Deterministic)", "Very Focused (Low Randomness)", "Moderately Focused (Slight Randomness)",
89
+ "Balanced (Moderate Creativity)", "Slightly Creative (Moderate Randomness)",
90
+ "Creative (High Randomness)", "Highly Creative (Very High Randomness)",
91
+ "Experimental (Maximum Randomness)"
92
+ ], value="Balanced (Moderate Creativity)"),
93
+ gr.Slider(label="Content Length (Words)", minimum=100, maximum=1000, value=200, step=10),
94
+ gr.Dropdown(label="Theme/Nature (Optional)", choices=[
95
+ "General/None", "Narrative/Storytelling", "Informative/Educational", "Descriptive/Atmospheric",
96
+ "Persuasive/Argumentative", "Humorous/Comedic", "Emotional/Inspirational", "Technical/Scientific",
97
+ "Historical/Cultural", "Modern/Contemporary", "Futuristic/Sci-Fi", "Fantasy/Mythical",
98
+ "Mystery/Suspense", "Adventure/Exploration", "Realistic/Documentary", "Philosophical/Reflective",
99
+ "Social/Relational", "Environmental/Nature", "Personal/Anecdotal"
100
+ ], value="General/None"),
101
+ gr.Textbox(label="Customization", placeholder="Add any extra information to help customize the generated content"),
102
+ gr.Slider(minimum=-50, maximum=50, value=0, label="Speech Rate Adjustment (%)", step=1),
103
+ gr.Slider(minimum=-20, maximum=20, value=0, label="Pitch Adjustment (Hz)", step=1)
104
+ ],
105
+ outputs=[
106
+ gr.Textbox(label="Generated Text"),
107
+ gr.Audio(label="Generated Audio", type="filepath"),
108
+ gr.Markdown(label="Error/Warning", visible=True)
109
+ ],
110
+ title="✨ AI VoiceCraft: Text-to-Speech Studio 🎙️",
111
+ description="""
112
+ 🚀 Transform your text into captivating audio! 🚀
113
+
114
+ This tool generates AI-powered content and converts it into lifelike speech using Microsoft Edge TTS.
115
+
116
+ 🔹 **Features at a Glance:**
117
+ 🌍 Supports multiple languages and voices
118
+ 🎚️ Adjust speech rate and pitch for natural delivery
119
+ 📝 Generate dynamic content: stories, news, podcasts & more
120
+ 🎭 Customize tone, length, and style to fit your needs
121
+
122
+ """,
123
+ article="""
124
+ # 🌟 Welcome to AI VoiceCraft! 🌟
125
+
126
+ **Unleash the power of AI-driven text-to-speech.**
127
+
128
+ This advanced application blends **cutting-edge AI content generation** with high-quality speech synthesis to create immersive audio experiences.
129
+
130
+ ## 🎤 Key Highlights:
131
+ 🔊 Natural and expressive voice output
132
+ 📖 AI-powered script generation tailored for speech
133
+ ⚙️ Fine-tune pitch, rate, and delivery style
134
+
135
+ 🔗 [Discover more AI tools@MusabbirKM](https://www.example.com/ai-tools)
136
+ """,
137
+
138
+ allow_flagging="never",
139
+ api_name=None,
140
+ )
141
+ return demo
142
+
143
+ async def main():
144
+ demo = create_demo()
145
+ demo.queue(default_concurrency_limit=5)
146
+ demo.launch(show_api=False)
147
+
148
+
149
+ if __name__ == "__main__":
150
+ try:
151
+ asyncio.run(main())
152
+ except RuntimeError:
153
+ import nest_asyncio
154
+ nest_asyncio.apply()
155
+ asyncio.run(main())
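
Two details in `app.py` are easy to get wrong: failures are reported as plain strings with the prefixes `Input Error:`, `Network Error:`, `General Error:`, and `Generation failed:`, and the `+d` format code used for `rate` and `pitch` raises `ValueError` when given a float. A minimal sketch of two helpers that cover both cases follows. The names `is_error_message` and `format_prosody` are hypothetical, and the possibility that a slider delivers a float is an assumption rather than something this commit demonstrates.

```python
# Prefixes taken from the return statements in app.py and VOCALIS/task.py.
ERROR_PREFIXES = ("Input Error:", "Network Error:", "General Error:", "Generation failed:")


def is_error_message(text: str) -> bool:
    """Return True when the generator returned an error string instead of a script."""
    return text.startswith(ERROR_PREFIXES)


def format_prosody(rate: float, pitch: float) -> tuple[str, str]:
    """Build signed rate and pitch strings, coercing floats to int first."""
    return f"{int(rate):+d}%", f"{int(pitch):+d}Hz"


print(is_error_message("General Error: timeout"))
print(format_prosody(-10.0, 5))
```

Keeping the prefixes in one tuple means a new error message only has to be registered in a single place.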
edgeTTsLang.py ADDED
@@ -0,0 +1,272 @@
1
+ languages = {
2
+ "Malayalam (India)": {
3
+ "Female": "ml-IN-SobhanaNeural",
4
+ "Male": "ml-IN-MidhunNeural"
5
+ },
6
+ "Hindi (India)": {
7
+ "Female": "hi-IN-SwaraNeural",
8
+ "Male": "hi-IN-MadhurNeural"
9
+ },
10
+ "Kannada (India)": {
11
+ "Female": "kn-IN-SapnaNeural",
12
+ "Male": "kn-IN-GaganNeural"
13
+ },
14
+ "Tamil (India)": {
15
+ "Female": "ta-IN-PallaviNeural",
16
+ "Male": "ta-IN-ValluvarNeural"
17
+ },
18
+ "Telugu (India)": {
19
+ "Female": "te-IN-ShrutiNeural",
20
+ "Male": "te-IN-MohanNeural"
21
+ },
22
+ "Urdu (India)": {
23
+ "Female": "ur-IN-GulNeural",
24
+ "Male": "ur-IN-SarfarazNeural"
25
+ },
26
+ "Gujarati (India)": {
27
+ "Female": "gu-IN-DhwaniNeural",
28
+ "Male": "gu-IN-NiranjanNeural"
29
+ },
30
+ "Marathi (India)": {
31
+ "Female": "mr-IN-AarohiNeural",
32
+ "Male": "mr-IN-ManoharNeural"
33
+ },
34
+ "Odia (India)": {
35
+ "Female": "or-IN-TariniNeural",
36
+ "Male": "or-IN-BiswajitNeural"
37
+ },
38
+ "Punjabi (India)": {
39
+ "Female": "pa-IN-GagandeepNeural",
40
+ "Male": "pa-IN-NirvairNeural"
41
+ },
42
+ "Assamese (India)": {
43
+ "Female": "as-IN-PariNeural",
44
+ "Male": "as-IN-NiloyNeural"
45
+ },
46
+ "Afrikaans (South Africa)": {
47
+ "Female": "af-ZA-AdriNeural",
48
+ "Male": "af-ZA-WillemNeural"
49
+ },
50
+ "Albanian (Albania)": {
51
+ "Female": "sq-AL-AnilaNeural",
52
+ "Male": "sq-AL-IlirNeural"
53
+ },
54
+ "Amharic (Ethiopia)": {
55
+ "Female": "am-ET-MekdesNeural",
56
+ "Male": "am-ET-AmehaNeural"
57
+ },
58
+ "Arabic (Algeria)": {
59
+ "Female": "ar-DZ-AminaNeural",
60
+ "Male": "ar-DZ-IsmaelNeural"
61
+ },
62
+ "Arabic (Bahrain)": {
63
+ "Female": "ar-BH-LailaNeural",
64
+ "Male": "ar-BH-AliNeural"
65
+ },
66
+ "Arabic (Egypt)": {
67
+ "Female": "ar-EG-SalmaNeural",
68
+ "Male": "ar-EG-ShakirNeural"
69
+ },
70
+ "Arabic (Iraq)": {
71
+ "Female": "ar-IQ-RanaNeural",
72
+ "Male": "ar-IQ-BasselNeural"
73
+ },
74
+ "Arabic (Jordan)": {
75
+ "Female": "ar-JO-SanaNeural",
76
+ "Male": "ar-JO-TaimNeural"
77
+ },
78
+ "Arabic (Kuwait)": {
79
+ "Female": "ar-KW-NouraNeural",
80
+ "Male": "ar-KW-FahedNeural"
81
+ },
82
+ "Arabic (Lebanon)": {
83
+ "Female": "ar-LB-LaylaNeural",
84
+ "Male": "ar-LB-RamiNeural"
85
+ },
86
+ "Arabic (Libya)": {
87
+ "Female": "ar-LY-ImanNeural",
88
+ "Male": "ar-LY-OmarNeural"
89
+ },
90
+ "Arabic (Morocco)": {
91
+ "Female": "ar-MA-MounaNeural",
92
+ "Male": "ar-MA-JamalNeural"
93
+ },
94
+ "Arabic (Oman)": {
95
+ "Female": "ar-OM-AyshaNeural",
96
+ "Male": "ar-OM-SultanNeural"
97
+ },
98
+ "Arabic (Qatar)": {
99
+ "Female": "ar-QA-AmalNeural",
100
+ "Male": "ar-QA-MoazNeural"
101
+ },
102
+ "Arabic (Saudi Arabia)": {
103
+ "Female": "ar-SA-HodaNeural",
104
+ "Male": "ar-SA-FahdNeural"
105
+ },
106
+ "Arabic (Syria)": {
107
+ "Female": "ar-SY-AmanyNeural",
108
+ "Male": "ar-SY-LaithNeural"
109
+ },
110
+ "Arabic (Tunisia)": {
111
+ "Female": "ar-TN-ReemNeural",
112
+ "Male": "ar-TN-HediNeural"
113
+ },
114
+ "Arabic (UAE)": {
115
+ "Female": "ar-AE-FatimaNeural",
116
+ "Male": "ar-AE-HamdanNeural"
117
+ },
118
+ "Arabic (Yemen)": {
119
+ "Female": "ar-YE-MaryamNeural",
120
+ "Male": "ar-YE-SalehNeural"
121
+ },
122
+ "Armenian (Armenia)": {
123
+ "Female": "hy-AM-AnahitNeural",
124
+ "Male": "hy-AM-HaykNeural"
125
+ },
126
+ "Basque (Spain)": {
127
+ "Female": "eu-ES-AinhoaNeural",
128
+ "Male": "eu-ES-AnderNeural"
129
+ },
130
+ "Bengali (Bangladesh)": {
131
+ "Female": "bn-BD-NabanitaNeural",
132
+ "Male": "bn-BD-PradeepNeural"
133
+ },
134
+ "Bengali (India)": {
135
+ "Female": "bn-IN-BashantiNeural",
136
+ "Male": "bn-IN-TanishNeural"
137
+ },
138
+ "English (India)": {
139
+ "Female": "en-IN-NeerjaNeural",
140
+ "Male": "en-IN-PrabhatNeural"
141
+ },
142
+ "English (Australia)": {
143
+ "Female": "en-AU-NatashaNeural",
144
+ "Male": "en-AU-WilliamNeural"
145
+ },
146
+ "English (Canada)": {
147
+ "Female": "en-CA-ClaraNeural",
148
+ "Male": "en-CA-LiamNeural"
149
+ },
150
+ "English (Ireland)": {
151
+ "Female": "en-IE-EmilyNeural",
152
+ "Male": "en-IE-ConnorNeural"
153
+ },
154
+ "English (New Zealand)": {
155
+ "Female": "en-NZ-MollyNeural",
156
+ "Male": "en-NZ-MitchellNeural"
157
+ },
158
+ "English (South Africa)": {
159
+ "Female": "en-ZA-LeahNeural",
160
+ "Male": "en-ZA-LukeNeural"
161
+ },
162
+ "English (United Kingdom)": {
163
+ "Female": "en-GB-LibbyNeural",
164
+ "Male": "en-GB-RyanNeural"
165
+ },
166
+ "English (United States)": {
167
+ "Female": "en-US-JennyNeural",
168
+ "Male": "en-US-GuyNeural"
169
+ },
170
+ "English (Uganda)": {
171
+ "Female": "en-UG-EmilyNeural",
172
+ "Male": "en-UG-ConnorNeural"
173
+ },
174
+ "Bosnian (Bosnia and Herzegovina)": {
175
+ "Female": "bs-BA-VesnaNeural",
176
+ "Male": "bs-BA-GoranNeural"
177
+ },
178
+ "Bulgarian (Bulgaria)": {
179
+ "Female": "bg-BG-KalinaNeural",
180
+ "Male": "bg-BG-BorislavNeural"
181
+ },
182
+ "Catalan (Spain)": {
183
+ "Female": "ca-ES-AlbaNeural",
184
+ "Male": "ca-ES-EnricNeural"
185
+ },
186
+ "Chinese (Cantonese, Traditional)": {
187
+ "Female": "zh-HK-HiuGaaiNeural",
188
+ "Male": "zh-HK-WanLungNeural"
189
+ },
190
+ "Chinese (Mandarin, Simplified)": {
191
+ "Female": "zh-CN-XiaoxiaoNeural",
192
+ "Male": "zh-CN-YunxiNeural"
193
+ },
194
+ "Chinese (Mandarin, Traditional)": {
195
+ "Female": "zh-TW-HsiaoYuNeural",
196
+ "Male": "zh-TW-YunJheNeural"
197
+ },
198
+ "Croatian (Croatia)": {
199
+ "Female": "hr-HR-GabrijelaNeural",
200
+ "Male": "hr-HR-SreckoNeural"
201
+ },
202
+ "Czech (Czech Republic)": {
203
+ "Female": "cs-CZ-VlastaNeural",
204
+ "Male": "cs-CZ-AntoninNeural"
205
+ },
206
+ "Danish (Denmark)": {
207
+ "Female": "da-DK-ChristelNeural",
208
+ "Male": "da-DK-JeppeNeural"
209
+ },
210
+ "Dutch (Belgium)": {
211
+ "Female": "nl-BE-DenaNeural",
212
+ "Male": "nl-BE-ArnaudNeural"
213
+ },
214
+ "Dutch (Netherlands)": {
215
+ "Female": "nl-NL-ColetteNeural",
216
+ "Male": "nl-NL-MaartenNeural"
217
+ },
218
+ "Estonian (Estonia)": {
219
+ "Female": "et-EE-AnuNeural",
220
+ "Male": "et-EE-KertNeural"
221
+ },
222
+ "Filipino (Philippines)": {
223
+ "Female": "fil-PH-BlessicaNeural",
224
+ "Male": "fil-PH-AngeloNeural"
225
+ },
226
+ "Finnish (Finland)": {
227
+ "Female": "fi-FI-NooraNeural",
228
+ "Male": "fi-FI-HarriNeural"
229
+ },
230
+ "French (Belgium)": {
231
+ "Female": "fr-BE-CharlineNeural",
232
+ "Male": "fr-BE-GerardNeural"
233
+ },
234
+ "French (Canada)": {
235
+ "Female": "fr-CA-SylvieNeural",
236
+ "Male": "fr-CA-AntoineNeural"
237
+ },
238
+ "French (France)": {
239
+ "Female": "fr-FR-DeniseNeural",
240
+ "Male": "fr-FR-HenriNeural"
241
+ },
242
+ "Galician (Spain)": {
243
+ "Female": "gl-ES-SabelaNeural",
244
+ "Male": "gl-ES-RoiNeural"
245
+ },
246
+ "Georgian (Georgia)": {
247
+ "Female": "ka-GE-EkaNeural",
248
+ "Male": "ka-GE-GiorgiNeural"
249
+ },
250
+ "German (Austria)": {
251
+ "Female": "de-AT-IngridNeural",
252
+ "Male": "de-AT-JonasNeural"
253
+ },
254
+ "German (Germany)": {
255
+ "Female": "de-DE-KatjaNeural",
256
+ "Male": "de-DE-ConradNeural"
257
+ },
258
+ "German (Switzerland)": {
259
+ "Female": "de-CH-LeniNeural",
260
+ "Male": "de-CH-JanNeural"
261
+ },
262
+ "Indonesian (Indonesia)": {
263
+ "Female": "id-ID-GadisNeural",
264
+ "Male": "id-ID-ArdiNeural"
265
+ },
266
+ "Japanese (Japan)": {
267
+ "Female": "ja-JP-NanamiNeural",
268
+ "Male": "ja-JP-KeitaNeural"
269
+ },
270
+ }
271
+
272
+
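
`app.py` looks voices up with `languages[language][voice]`, which raises a bare `KeyError` for an unknown language or gender. A minimal sketch of a lookup that reports a readable error and sanity-checks the shape of the voice id is given below. `resolve_voice`, `VOICE_ID`, and `sample_languages` are hypothetical names, and the regular expression only reflects the naming pattern seen in this file, not an official specification.

```python
import re

# Small excerpt of the table, copied from edgeTTsLang.py.
sample_languages = {
    "English (United States)": {"Female": "en-US-JennyNeural", "Male": "en-US-GuyNeural"},
    "Japanese (Japan)": {"Female": "ja-JP-NanamiNeural", "Male": "ja-JP-KeitaNeural"},
}

# Assumed shape: language code, region code, voice name ending in "Neural".
VOICE_ID = re.compile(r"^[a-z]{2,3}-[A-Z]{2}-[A-Za-z]+Neural$")


def resolve_voice(table: dict, language: str, gender: str) -> str:
    """Return the voice id for a language/gender pair or raise ValueError."""
    try:
        voice = table[language][gender]
    except KeyError as exc:
        raise ValueError(f"No {gender!r} voice configured for {language!r}") from exc
    if not VOICE_ID.match(voice):
        raise ValueError(f"Malformed voice id: {voice!r}")
    return voice


print(resolve_voice(sample_languages, "Japanese (Japan)", "Male"))
```

A pattern check of this kind catches typos in the table but cannot confirm that a voice actually exists on the TTS service.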
requirements.txt ADDED
@@ -0,0 +1,5 @@
1
+ gradio~=5.16.0
2
+ requests~=2.32.3
3
+ yt-dlp~=2025.1.26
4
+
5
+
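
`app.py` and `VOCALIS/task.py` import `edge_tts`, `google.generativeai`, and `nest_asyncio` in addition to the packages pinned above. A minimal sketch of a pre-launch check that reports which of those modules cannot be found in the current environment is shown here; `missing_modules` is a hypothetical helper and not part of this commit.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except ImportError:
            # Raised for dotted names whose parent package is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Module names imported by app.py and VOCALIS/task.py in this commit.
imported = ["gradio", "requests", "edge_tts", "google.generativeai", "nest_asyncio"]
print(missing_modules(imported))
```

An empty list means every import should resolve; any name printed points at a package that still needs to be installed.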