dcrey7 committed
Commit
7ee5fa7
·
1 Parent(s): 12c48e3

version1tutorgradio

Files changed (6)
  1. .gitignore +56 -0
  2. README.md +51 -7
  3. app.py +765 -0
  4. pyproject.toml +17 -0
  5. requirements.txt +9 -0
  6. test.py +128 -0
.gitignore ADDED
@@ -0,0 +1,56 @@
1
+ # Environment variables
2
+ .env
3
+ .env.*
4
+ !.env.example
5
+
6
+ # Python cache
7
+ __pycache__/
8
+ *.py[cod]
9
+ *$py.class
10
+ *.so
11
+
12
+ # Distribution / packaging
13
+ .Python
14
+ build/
15
+ develop-eggs/
16
+ dist/
17
+ downloads/
18
+ eggs/
19
+ .eggs/
20
+ lib/
21
+ lib64/
22
+ parts/
23
+ sdist/
24
+ var/
25
+ wheels/
26
+ *.egg-info/
27
+ .installed.cfg
28
+ *.egg
29
+
30
+ # Virtual environments
31
+ venv/
32
+ ENV/
33
+ env/
34
+ .venv
35
+
36
+ # IDE and OS files
37
+ .vscode/
38
+ .idea/
39
+ *.DS_Store
40
+ .DS_Store?
41
+ ._*
42
+ .Spotlight-V100
43
+ .Trashes
44
+ Thumbs.db
45
+
46
+ # Gradio
47
+ gradio_cached_examples/
48
+ flagged/
49
+
50
+ # Audio files (temporary)
51
+ *.mp3
52
+ *.wav
53
+ *.ogg
54
+
55
+ # uv
56
+ .uv/
README.md CHANGED
@@ -1,13 +1,57 @@
1
  ---
2
- title: Mr Misrtral
3
- emoji: 📊
4
- colorFrom: purple
5
- colorTo: indigo
6
  sdk: gradio
7
- sdk_version: 5.36.2
8
  app_file: app.py
9
  pinned: false
10
- short_description: french language tutot
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: French Tutor
3
+ emoji: 🇫🇷
4
+ colorFrom: blue
5
+ colorTo: red
6
  sdk: gradio
7
+ sdk_version: 4.31.0
8
  app_file: app.py
9
  pinned: false
10
+ secrets:
11
+ - MISTRAL_API_KEY
12
+ - GEMINI_API_KEY
13
+ - GROQ_API_KEY
14
  ---
15
 
16
+ # 🇫🇷 French Conversation Tutor
17
+
18
+ This is a Gradio app for practicing French conversation through natural speech interaction with Mr. Mistral!
19
+
20
+ **How it works:**
21
+ 1. The app gives you a scenario and some helpful phrases.
22
+ 2. You record your response in French using your microphone.
23
+ 3. The AI tutor (Mr. Mistral) replies, and you have a short conversation (3 exchanges).
24
+ 4. At the end, you get a detailed analysis of your grammar, pronunciation, and vocabulary.
25
+
26
+ **Tech Stack:**
27
+ - **Primary LLM:** Mistral AI (mistral-large-latest)
28
+ - **Fallback LLM:** Google Gemini API (gemini-1.5-flash-latest)
29
+ - **STT:** Groq API (Whisper large-v3-turbo)
30
+ - **TTS:** Groq API (TTS-1 model) with gTTS fallback
31
+ - **UI:** Gradio
32
+
33
+ The app will show which LLM is being used in the interface. It prioritizes Mistral AI, but falls back to Google Gemini if Mistral is unavailable.
34
+
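For illustration, the client-selection logic in `app.py` boils down to roughly the sketch below (same environment-variable names as above; this is a simplification of the actual startup code, not a drop-in replacement):

```python
import os
from mistralai import Mistral
import google.generativeai as genai

mistral_key = os.environ.get("MISTRAL_API_KEY")
gemini_key = os.environ.get("GEMINI_API_KEY")

if mistral_key:
    # Primary path: Mistral handles the conversation
    mistral_client = Mistral(api_key=mistral_key)
    current_llm = "Mistral AI"
elif gemini_key:
    # Fallback path: used only when no Mistral key is configured
    genai.configure(api_key=gemini_key)
    current_llm = "Google Gemini (Fallback)"
else:
    raise ValueError("Neither MISTRAL_API_KEY nor GEMINI_API_KEY found.")
```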
35
+ **Features:**
36
+ - 🎤 Speech-to-text transcription of French audio
37
+ - 🔊 Text-to-speech for AI tutor responses
38
+ - 💬 Natural conversation flow
39
+ - 📊 Detailed analysis after 3 exchanges
40
+ - 🎲 Random topic generation for variety
41
+ - 🤖 Model indicator showing which AI is responding
42
+
43
+ **Setup:**
44
+ 1. Create a `.env` file with:
45
+ ```
46
+ MISTRAL_API_KEY=your_mistral_api_key
47
+ GEMINI_API_KEY=your_gemini_api_key # Optional fallback
48
+ GROQ_API_KEY=your_groq_api_key
49
+ ```
50
+ **Important:** Never commit your `.env` file to version control!
51
+
52
+ 2. Install dependencies: `pip install -r requirements.txt`
53
+ 3. Test your API keys: `python test.py`
54
+ 4. Run: `python app.py`
55
+
56
+ **For HuggingFace Spaces:**
57
+ - Add `MISTRAL_API_KEY`, `GEMINI_API_KEY` (optional), and `GROQ_API_KEY` in the Space's Settings > Variables and secrets
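On Spaces, secrets added there are exposed to the app as ordinary environment variables, so the same `os.environ.get(...)` lookups used with a local `.env` work unchanged; a minimal check (not part of the app):

```python
import os

for name in ("MISTRAL_API_KEY", "GEMINI_API_KEY", "GROQ_API_KEY"):
    print(f"{name} configured: {'yes' if os.environ.get(name) else 'no'}")
```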
app.py ADDED
@@ -0,0 +1,765 @@
1
+ """
2
+ French Conversation Tutor - Main Application
3
+ Practice French through natural conversation with Mr. Mistral!
4
+ """
5
+
6
+ import gradio as gr
7
+ import numpy as np
8
+ import os
9
+ import io
10
+ import wave
11
+ import tempfile
12
+ import time
13
+ from datetime import datetime
14
+ from typing import List, Dict, Tuple
15
+ import re
16
+ import random
17
+ import shutil
18
+ from dotenv import load_dotenv
19
+ import soundfile as sf # Added missing import
20
+
21
+ # Load environment variables
22
+ load_dotenv()
23
+
24
+ # Model imports
25
+ from mistralai import Mistral
26
+ import google.generativeai as genai
27
+ from groq import Groq
28
+ import openai
29
+
30
+ # Load API keys
31
+ mistral_api_key = os.environ.get("MISTRAL_API_KEY")
32
+ gemini_api_key = os.environ.get("GEMINI_API_KEY")
33
+ groq_api_key = os.environ.get("GROQ_API_KEY")
34
+
35
+ # Debug: Check if keys are loaded
36
+ print(f"Mistral API key loaded: {'Yes' if mistral_api_key else 'No'}")
37
+ print(f"Gemini API key loaded: {'Yes' if gemini_api_key else 'No'}")
38
+ print(f"Groq API key loaded: {'Yes' if groq_api_key else 'No'}")
39
+ print(f"OpenAI API key loaded: {'Yes' if os.environ.get('OPENAI_API_KEY') else 'No'}")
40
+
41
+ # Initialize clients
42
+ mistral_client = None
43
+ if mistral_api_key:
44
+ mistral_client = Mistral(api_key=mistral_api_key)
45
+ current_llm = "Mistral AI"
46
+ elif gemini_api_key:
47
+ genai.configure(api_key=gemini_api_key)
48
+ current_llm = "Google Gemini (Fallback)"
49
+ else:
50
+ raise ValueError("Neither MISTRAL_API_KEY nor GEMINI_API_KEY found in environment variables.")
51
+
52
+ # Initialize Gemini for fallback even if Mistral is primary
53
+ if gemini_api_key and mistral_api_key:
54
+ genai.configure(api_key=gemini_api_key)
55
+
56
+ if not groq_api_key:
57
+ raise ValueError("GROQ_API_KEY not found in environment variables.")
58
+ groq_client = Groq(api_key=groq_api_key)
59
+
60
+ # Global list to track temp files (to prevent deletion before serving)
61
+ temp_audio_files = []
62
+ # current_llm was set above when the LLM clients were initialized; do not reset it here
63
+
64
+ def cleanup_old_audio_files():
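+ """Prune temp TTS files: keep the 20 most recent and only delete ones older than 60 seconds."""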
65
+ global temp_audio_files
66
+ # Keep more files and add delay to avoid deleting files being served
67
+ if len(temp_audio_files) > 20: # Increased from 10 to 20
68
+ old_files = temp_audio_files[:-20]
69
+ for file_path in old_files:
70
+ try:
71
+ # Check if file is older than 60 seconds before deleting
72
+ if os.path.exists(file_path):
73
+ file_age = datetime.now().timestamp() - os.path.getmtime(file_path)
74
+ if file_age > 60: # Only delete files older than 60 seconds
75
+ os.remove(file_path)
76
+ temp_audio_files.remove(file_path)
77
+ except:
78
+ pass
79
+
80
+ def get_system_prompt():
81
+ return """You are Mr. Mistral, a French tutor having a conversation with ONE student.
82
+
83
+ CRITICAL: You are ONLY the tutor. The student will speak to you, and you respond ONLY to what they actually said.
84
+
85
+ NEVER:
86
+ - Create dialogue for the student
87
+ - Imagine what the student might say
88
+ - Write "You:" or "Student:" or any dialogue
89
+ - Continue the conversation by yourself
90
+
91
+ ALWAYS:
92
+ - Wait for the student's actual input
93
+ - Respond with ONE French sentence only
94
+ - Use exactly 3 lines:
95
+
96
+ French sentence
97
+ (pronunciation)
98
+ [translation]
99
+
100
+ Example - if student says "Bonjour":
101
+ Bonjour! Comment allez-vous?
102
+ (bohn-ZHOOR! koh-mahn tah-lay VOO?)
103
+ [Hello! How are you?]
104
+
105
+ ONE sentence response only. NO additional dialogue."""
106
+
107
+ def validate_response_format(response: str) -> Tuple[bool, str]:
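+ """Drop any role-played dialogue and coerce the reply into the 3-line format: French / (pronunciation) / [translation]."""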
108
+ lines = response.strip().split('\n')
109
+ cleaned_lines = []
110
+ for line in lines:
111
+ line = line.strip()
112
+ if any(marker in line.lower() for marker in ['you:', 'user:', 'student:', 'me:', 'moi:']):
113
+ continue
114
+ if 'what do you' in line.lower() or "qu'est-ce que" in line.lower():
115
+ continue
116
+ if line:
117
+ cleaned_lines.append(line)
118
+ french_line = None
119
+ pronunciation_line = None
120
+ translation_line = None
121
+ for i, line in enumerate(cleaned_lines):
122
+ if '(' in line and ')' in line and not pronunciation_line:
123
+ pronunciation_line = line
124
+ if i > 0 and not french_line:
125
+ french_line = cleaned_lines[i-1]
126
+ elif '[' in line and ']' in line and not translation_line:
127
+ translation_line = line
128
+ if not french_line:
129
+ for line in cleaned_lines:
130
+ if line and not any(c in line for c in ['(', ')', '[', ']', '*']):
131
+ french_line = line
132
+ break
133
+ if french_line:
134
+ if not pronunciation_line:
135
+ pronunciation_line = "(pronunciation guide not available)"
136
+ if not translation_line:
137
+ translation_line = "[translation not available]"
138
+ return True, f"{french_line}\n{pronunciation_line}\n{translation_line}"
139
+ return False, response
140
+
141
+ def generate_scenario():
142
+ """Generate initial scenario and hints"""
143
+ try:
144
+ # List of diverse topics
145
+ topics = [
146
+ {
147
+ "name": "Daily Routine",
148
+ "phrases": [
149
+ "Je me réveille à... (zhuh muh ray-vay ah) [I wake up at...]",
150
+ "Je prends le petit déjeuner (zhuh prahn luh puh-tee day-zhuh-nay) [I have breakfast]",
151
+ "Je travaille de... à... (zhuh trah-vay duh... ah) [I work from... to...]",
152
+ "Le soir, je... (luh swahr, zhuh) [In the evening, I...]"
153
+ ],
154
+ "opening": "À quelle heure vous levez-vous le matin?\n(ah kel uhr voo luh-vay voo luh mah-tahn?)\n[What time do you get up in the morning?]"
155
+ },
156
+ {
157
+ "name": "Favorite Foods",
158
+ "phrases": [
159
+ "Mon plat préféré est... (mohn plah pray-fay-ray ay) [My favorite dish is...]",
160
+ "J'adore... (zhah-dohr) [I love...]",
161
+ "Je n'aime pas... (zhuh nehm pah) [I don't like...]",
162
+ "C'est délicieux! (say day-lee-see-uh) [It's delicious!]"
163
+ ],
164
+ "opening": "Quel est votre plat préféré?\n(kel ay voh-truh plah pray-fay-ray?)\n[What is your favorite dish?]"
165
+ },
166
+ {
167
+ "name": "Work and Career",
168
+ "phrases": [
169
+ "Je travaille comme... (zhuh trah-vay kohm) [I work as...]",
170
+ "Mon bureau est... (mohn bew-roh ay) [My office is...]",
171
+ "J'aime mon travail (zhehm mohn trah-vay) [I like my job]",
172
+ "Mes collègues sont... (may koh-lehg sohn) [My colleagues are...]"
173
+ ],
174
+ "opening": "Qu'est-ce que vous faites comme travail?\n(kess-kuh voo feht kohm trah-vay?)\n[What do you do for work?]"
175
+ },
176
+ {
177
+ "name": "Music and Hobbies",
178
+ "phrases": [
179
+ "J'écoute... (zhay-koot) [I listen to...]",
180
+ "Mon chanteur préféré est... (mohn shahn-tuhr pray-fay-ray ay) [My favorite singer is...]",
181
+ "Je joue de... (zhuh zhoo duh) [I play (instrument)...]",
182
+ "Dans mon temps libre... (dahn mohn tahn lee-bruh) [In my free time...]"
183
+ ],
184
+ "opening": "Quel type de musique aimez-vous?\n(kel teep duh mew-zeek ay-may voo?)\n[What type of music do you like?]"
185
+ },
186
+ {
187
+ "name": "Weekend Plans",
188
+ "phrases": [
189
+ "Ce weekend, je vais... (suh wee-kehnd, zhuh vay) [This weekend, I'm going to...]",
190
+ "J'aimerais... (zheh-muh-ray) [I would like to...]",
191
+ "Avec mes amis... (ah-vek may zah-mee) [With my friends...]",
192
+ "Ça sera amusant! (sah suh-rah ah-mew-zahn) [It will be fun!]"
193
+ ],
194
+ "opening": "Qu'est-ce que vous faites ce weekend?\n(kess-kuh voo feht suh wee-kehnd?)\n[What are you doing this weekend?]"
195
+ },
196
+ {
197
+ "name": "Family and Friends",
198
+ "phrases": [
199
+ "Ma famille habite... (mah fah-mee ah-beet) [My family lives...]",
200
+ "J'ai... frères/soeurs (zhay... frehr/suhr) [I have... brothers/sisters]",
201
+ "Mon meilleur ami... (mohn may-yuhr ah-mee) [My best friend...]",
202
+ "Nous aimons... ensemble (noo zeh-mohn... ahn-sahm-bluh) [We like to... together]"
203
+ ],
204
+ "opening": "Parlez-moi de votre famille!\n(pahr-lay mwah duh voh-truh fah-mee!)\n[Tell me about your family!]"
205
+ },
206
+ {
207
+ "name": "Weather and Seasons",
208
+ "phrases": [
209
+ "Il fait beau/mauvais (eel feh boh/moh-veh) [The weather is nice/bad]",
210
+ "J'aime l'été/l'hiver (zhehm lay-tay/lee-vehr) [I like summer/winter]",
211
+ "Il pleut souvent (eel pluh soo-vahn) [It rains often]",
212
+ "Ma saison préférée est... (mah seh-zohn pray-fay-ray ay) [My favorite season is...]"
213
+ ],
214
+ "opening": "Quel temps fait-il aujourd'hui?\n(kel tahn feh-teel oh-zhoor-dwee?)\n[What's the weather like today?]"
215
+ },
216
+ {
217
+ "name": "Travel and Vacations",
218
+ "phrases": [
219
+ "J'ai visité... (zhay vee-zee-tay) [I visited...]",
220
+ "Je voudrais aller à... (zhuh voo-dray ah-lay ah) [I would like to go to...]",
221
+ "En vacances, je... (ahn vah-kahns, zhuh) [On vacation, I...]",
222
+ "C'était magnifique! (say-teh mahn-yee-feek) [It was magnificent!]"
223
+ ],
224
+ "opening": "Où aimez-vous voyager?\n(oo ay-may voo vwah-yah-zhay?)\n[Where do you like to travel?]"
225
+ }
226
+ ]
227
+
228
+ # Select a random topic
229
+ selected_topic = random.choice(topics)
230
+
231
+ # Format the scenario directly without using LLM
232
+ scenario = f"""**Topic: {selected_topic['name']}**
233
+
234
+ **Helpful phrases:**
235
+ - {selected_topic['phrases'][0]}
236
+ - {selected_topic['phrases'][1]}
237
+ - {selected_topic['phrases'][2]}
238
+ - {selected_topic['phrases'][3]}
239
+
240
+ {selected_topic['opening']}"""
241
+
242
+ return scenario
243
+
244
+ except Exception as e:
245
+ return f"Error generating scenario: {str(e)}"
246
+
247
+ def extract_french_for_tts(text: str) -> str:
248
+ """Extract only the French text (first line without parentheses/brackets)"""
249
+ lines = text.strip().split('\n')
250
+ for line in lines:
251
+ line = line.strip()
252
+ if line and '(' not in line and '[' not in line and '*' not in line and not line.startswith('**'):
253
+ return line
254
+ return ""
255
+
256
+ def process_speech_to_text(audio_tuple) -> Tuple[str, bool]:
257
+ """Convert audio to text using Groq Whisper"""
258
+ if audio_tuple is None:
259
+ return "No audio received", False
260
+ try:
261
+ sample_rate, audio_data = audio_tuple
262
+ wav_buffer = io.BytesIO()
263
+ sf.write(wav_buffer, audio_data, sample_rate, format='WAV')
264
+ wav_buffer.seek(0)
265
+ transcription = groq_client.audio.transcriptions.create(
266
+ file=("audio.wav", wav_buffer),
267
+ model="whisper-large-v3-turbo",
268
+ language="fr"
269
+ )
270
+ return transcription.text, True
271
+ except Exception as e:
272
+ error_msg = str(e)
273
+ if "401" in error_msg or "Invalid API Key" in error_msg:
274
+ return "Error: Invalid Groq API key. Please check your GROQ_API_KEY.", False
275
+ elif "quota" in error_msg.lower():
276
+ return "Error: Groq API quota exceeded. Please check your account.", False
277
+ else:
278
+ return f"Error in speech recognition: {error_msg}", False
279
+
280
+ def generate_tutor_response(conversation_history: List[Dict], user_text: str) -> str:
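+ """Generate the tutor's next turn, trying Mistral first and falling back to Gemini on error."""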
281
+ global current_llm
282
+
283
+ # Try Mistral first
284
+ if mistral_client:
285
+ try:
286
+ messages = [
287
+ {"role": "system", "content": get_system_prompt()}
288
+ ]
289
+ for msg in conversation_history:
290
+ role = "user" if msg["role"] == "user" else "assistant"
291
+ messages.append({"role": role, "content": msg["content"]})
292
+ messages.append({"role": "user", "content": user_text})
293
+
294
+ response = mistral_client.chat.complete(
295
+ model="mistral-large-latest",
296
+ messages=messages
297
+ )
298
+ raw_response = response.choices[0].message.content
299
+ current_llm = "Mistral AI"
300
+
301
+ is_valid, cleaned_response = validate_response_format(raw_response)
302
+ if not is_valid:
303
+ french_text = extract_french_for_tts(raw_response)
304
+ if french_text:
305
+ cleaned_response = f"{french_text}\n(pronunciation not available)\n[translation not available]"
306
+ return cleaned_response
307
+ except Exception as e:
308
+ print(f"Mistral error: {str(e)}, falling back to Gemini")
309
+ if not gemini_api_key:
310
+ return f"Error: Mistral failed and no Gemini fallback available: {str(e)}"
311
+
312
+ # Fallback to Gemini
313
+ if gemini_api_key:
314
+ try:
315
+ genai.configure(api_key=gemini_api_key)
316
+ model = genai.GenerativeModel("models/gemini-1.5-flash-latest")
317
+ messages = [
318
+ {"role": "user", "parts": [get_system_prompt()]}
319
+ ]
320
+ for msg in conversation_history:
321
+ messages.append({"role": msg["role"], "parts": [msg["content"]]})
322
+ messages.append({"role": "user", "parts": [user_text]})
323
+ response = model.generate_content(messages)
324
+ raw_response = response.text
325
+ current_llm = "Google Gemini (Fallback)"
326
+
327
+ is_valid, cleaned_response = validate_response_format(raw_response)
328
+ if not is_valid:
329
+ french_text = extract_french_for_tts(raw_response)
330
+ if french_text:
331
+ cleaned_response = f"{french_text}\n(pronunciation not available)\n[translation not available]"
332
+ return cleaned_response
333
+ except Exception as e:
334
+ return f"Error: Both Mistral and Gemini failed: {str(e)}"
335
+
336
+ return "Error: No LLM available"
337
+
338
+ def text_to_speech(text: str) -> str:
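+ """Synthesize the French line of `text` to an MP3 via Groq TTS, falling back to gTTS; returns a file path or None."""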
339
+ global temp_audio_files
340
+ try:
341
+ french_text = extract_french_for_tts(text)
342
+ if not french_text:
343
+ return None
344
+ # Use Groq TTS
345
+ tts_response = groq_client.audio.speech.create(
346
+ model="tts-1", # NOTE: OpenAI-style model name; if Groq rejects it, the gTTS fallback below is used
347
+ voice="alloy", # OpenAI-style voice name; change to a voice your Groq account actually supports
348
+ input=french_text
349
+ )
350
+ temp_dir = tempfile.mkdtemp()
351
+ temp_path = os.path.join(temp_dir, f"audio_{datetime.now().strftime('%Y%m%d_%H%M%S')}.mp3")
352
+ with open(temp_path, "wb") as f:
353
+ f.write(tts_response.content)
354
+ temp_audio_files.append(temp_path)
355
+ cleanup_old_audio_files()
356
+ return temp_path
357
+ except Exception as e:
358
+ error_msg = str(e)
359
+ if "401" in error_msg or "Invalid API Key" in error_msg:
360
+ print(f"Groq TTS Error: Invalid API key, falling back to gTTS")
361
+ else:
362
+ print(f"Groq TTS Error: {error_msg}, falling back to gTTS")
363
+ # Fallback to gTTS if Groq fails
364
+ try:
365
+ from gtts import gTTS
366
+ tts = gTTS(text=french_text, lang='fr')
367
+ temp_dir = tempfile.mkdtemp()
368
+ temp_path = os.path.join(temp_dir, f"audio_{datetime.now().strftime('%Y%m%d_%H%M%S')}.mp3")
369
+ tts.save(temp_path)
370
+ temp_audio_files.append(temp_path)
371
+ cleanup_old_audio_files()
372
+ return temp_path
373
+ except Exception as e2:
374
+ print(f"gTTS Fallback Error: {str(e2)}")
375
+ return None
376
+
377
+ def analyze_conversation(full_transcript: List[Dict]) -> str:
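+ """Produce the end-of-session feedback (grammar, pronunciation, vocabulary) from the full transcript."""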
378
+ global current_llm
379
+
380
+ transcript_text = "\n".join([
381
+ f"{msg['role']}: {msg['content']}" for msg in full_transcript
382
+ ])
383
+ analysis_prompt = """Analyze this French conversation and provide:\n1. Grammar corrections with specific examples\n2. Pronunciation tips for common mistakes\n3. Vocabulary suggestions to improve fluency\n4. Overall assessment with encouragement\n\nBe specific, constructive, and encouraging. Format clearly with sections."""
384
+
385
+ # Try Mistral first
386
+ if mistral_client:
387
+ try:
388
+ messages = [
389
+ {"role": "system", "content": analysis_prompt},
390
+ {"role": "user", "content": f"Analyze this conversation:\n{transcript_text}"}
391
+ ]
392
+
393
+ response = mistral_client.chat.complete(
394
+ model="mistral-large-latest",
395
+ messages=messages
396
+ )
397
+ current_llm = "Mistral AI"
398
+ return response.choices[0].message.content
399
+ except Exception as e:
400
+ print(f"Mistral error in analysis: {str(e)}, falling back to Gemini")
401
+
402
+ # Fallback to Gemini
403
+ if gemini_api_key:
404
+ try:
405
+ genai.configure(api_key=gemini_api_key)
406
+ model = genai.GenerativeModel("models/gemini-1.5-flash-latest")
407
+ messages = [
408
+ {"role": "user", "parts": [analysis_prompt]},
409
+ {"role": "user", "parts": [f"Analyze this conversation:\n{transcript_text}"]}
410
+ ]
411
+ response = model.generate_content(messages)
412
+ current_llm = "Google Gemini (Fallback)"
413
+ return response.text
414
+ except Exception as e:
415
+ return f"Error generating analysis: {str(e)}"
416
+
417
+ return "Error: No LLM available for analysis"
418
+
419
+ def create_app():
420
+ with gr.Blocks(title="French Tutor", theme=gr.themes.Soft()) as app:
421
+ # State management
422
+ conversation_state = gr.State([])
423
+ exchange_count = gr.State(0)
424
+ full_transcript = gr.State([])
425
+ current_scenario = gr.State("")
426
+
427
+ gr.Markdown("# 🇫🇷 French Conversation Tutor")
428
+ gr.Markdown("Practice French through natural conversation! (3 exchanges per session)")
429
+
430
+ # Model info banner
431
+ with gr.Row():
432
+ model_info = gr.Markdown(
433
+ f"**🤖 Models:** LLM: {current_llm} | STT: Groq Whisper | TTS: gTTS",
434
+ elem_id="model-info"
435
+ )
436
+
437
+ # Main layout with two columns
438
+ with gr.Row():
439
+ # Left sidebar (30% width)
440
+ with gr.Column(scale=3):
441
+ gr.Markdown("## 📚 Control Panel")
442
+
443
+ # Start/New Topic buttons
444
+ start_btn = gr.Button("Start New Conversation", variant="primary", size="lg")
445
+ new_topic_btn = gr.Button("🎲 Generate New Topic & Restart", variant="secondary", visible=False)
446
+
447
+ # Topic display in sidebar
448
+ with gr.Group():
449
+ gr.Markdown("### Current Topic")
450
+ sidebar_scenario = gr.Markdown("Click 'Start' to begin", elem_id="sidebar-scenario")
451
+
452
+ # Analysis section in sidebar
453
+ with gr.Group(visible=False) as analysis_group:
454
+ gr.Markdown("### 📊 Your Analysis")
455
+ analysis_box = gr.Markdown()
456
+ restart_btn = gr.Button("🔄 Start Another Conversation", variant="secondary", size="lg")
457
+
458
+ # Status in sidebar
459
+ status_text = gr.Textbox(
460
+ label="System Status",
461
+ value="Ready to start",
462
+ interactive=False
463
+ )
464
+
465
+ # Right main content (70% width)
466
+ with gr.Column(scale=7):
467
+ # Conversation interface
468
+ with gr.Column(visible=False) as conversation_ui:
469
+ gr.Markdown("## 💬 Conversation")
470
+
471
+ # Chat display - always visible
472
+ chat_display = gr.Markdown(value="", elem_id="chat-display")
473
+
474
+ # Progress indicator
475
+ progress_text = gr.Textbox(
476
+ label="Progress",
477
+ value="Ready to start",
478
+ interactive=False
479
+ )
480
+
481
+ # Audio interface
482
+ with gr.Row():
483
+ audio_input = gr.Audio(
484
+ sources=["microphone"],
485
+ type="numpy",
486
+ label="🎤 Record your response in French"
487
+ )
488
+ record_btn = gr.Button("Send Response", variant="primary")
489
+
490
+ # Tutor's audio response
491
+ audio_output = gr.Audio(
492
+ label="🔊 Tutor's Response",
493
+ type="filepath",
494
+ autoplay=True
495
+ )
496
+
497
+ def reset_conversation_states():
498
+ """Helper to reset all conversation states"""
499
+ return [], 0, [], "", gr.update(value=None)
500
+
501
+ def start_conversation(scenario_text=None):
502
+ """Initialize a new conversation"""
503
+ # Reset global state
504
+ global current_llm
505
+
506
+ print("Starting new conversation...")
507
+
508
+ # Generate scenario if not provided
509
+ if scenario_text is None:
510
+ scenario = generate_scenario()
511
+ else:
512
+ scenario = scenario_text
513
+
514
+ # Extract the tutor's first message for audio
515
+ audio_path = text_to_speech(scenario)
516
+ if audio_path is None:
517
+ audio_path = gr.update() # No change to audio output
518
+
519
+ # Format the scenario for display
520
+ scenario_display = scenario.strip()
521
+
522
+ # Create fresh empty states
523
+ new_conversation_state = []
524
+ new_full_transcript = []
525
+ new_exchange_count = 0
526
+
527
+ print(f"Reset states - Exchange count: {new_exchange_count}, History length: {len(new_conversation_state)}")
528
+
529
+ return (
530
+ gr.update(visible=True), # conversation_ui
531
+ scenario_display, # sidebar_scenario
532
+ scenario, # current_scenario state
533
+ "", # clear chat_display
534
+ new_exchange_count, # reset exchange_count
535
+ new_conversation_state, # reset conversation_state
536
+ new_full_transcript, # reset full_transcript
537
+ audio_path, # play initial audio
538
+ "Ready to start - 3 exchanges to go", # progress
539
+ gr.update(visible=False), # hide analysis_group
540
+ gr.update(visible=False), # hide start_btn
541
+ gr.update(visible=True), # show new_topic_btn
542
+ gr.update(value=None), # clear audio input
543
+ gr.update(interactive=True), # enable record button
544
+ "Ready to start" # status text
545
+ )
546
+
547
+ def generate_new_topic_and_start():
548
+ """Generate a new topic and start the conversation"""
549
+ scenario = generate_scenario()
550
+
551
+ # Return all the values that start_conversation returns
552
+ result = start_conversation(scenario)
553
+
554
+ # Update the progress text
555
+ result_list = list(result)
556
+ result_list[8] = "New topic generated! Ready to start - 3 exchanges to go" # Update progress text
557
+
558
+ return tuple(result_list)
559
+
560
+ def process_user_audio(audio, chat_text, exchanges, history, transcript, scenario):
561
+ """Process user's audio input and generate response"""
562
+ global current_llm
563
+
564
+ print(f"Processing audio - Exchange count: {exchanges}, History length: {len(history) if history else 0}")
565
+
566
+ # Ensure exchange count is an integer
567
+ if exchanges is None:
568
+ exchanges = 0
569
+
570
+ # Check if conversation is complete
571
+ if exchanges >= 3:
572
+ return (
573
+ chat_text, exchanges, history, transcript,
574
+ "Conversation complete! Check your analysis in the sidebar.",
575
+ f"Exchange {exchanges} of 3 - Complete!",
576
+ gr.update(), gr.update(value=None),
577
+ gr.update() # no change to model_info
578
+ )
579
+
580
+ # Ensure states are properly initialized
581
+ if history is None:
582
+ history = []
583
+ if transcript is None:
584
+ transcript = []
585
+ if chat_text is None:
586
+ chat_text = ""
587
+
588
+ # Check for audio
589
+ if audio is None:
590
+ return (
591
+ chat_text, exchanges, history, transcript,
592
+ "Please record audio first",
593
+ f"Exchange {exchanges} of 3",
594
+ gr.update(), gr.update(value=None),
595
+ gr.update() # no change to model_info
596
+ )
597
+
598
+ # Transcribe user's speech
599
+ user_text, success = process_speech_to_text(audio)
600
+
601
+ if not success:
602
+ return (
603
+ chat_text, exchanges, history, transcript,
604
+ user_text, # Error message
605
+ f"Exchange {exchanges} of 3",
606
+ gr.update(), gr.update(value=None),
607
+ gr.update() # no change to model_info
608
+ )
609
+
610
+ # Update chat display with user's message
611
+ if chat_text:
612
+ chat_text += f"\n\n**You:** {user_text}"
613
+ else:
614
+ # First message - include scenario context
615
+ chat_text = f"{scenario}\n\n---\n\n**You:** {user_text}"
616
+
617
+ # Get tutor's response
618
+ tutor_response = generate_tutor_response(history, user_text)
619
+
620
+ # Generate audio for tutor's response
621
+ audio_path = text_to_speech(tutor_response)
622
+ if audio_path is None:
623
+ audio_path = gr.update() # No change to audio output
624
+
625
+ # Update chat display with tutor's response
626
+ chat_text += f"\n\n**Mr. Mistral:**\n{tutor_response}"
627
+
628
+ # Update conversation history (for context)
629
+ history.append({"role": "user", "content": user_text})
630
+ history.append({"role": "assistant", "content": tutor_response})
631
+
632
+ # Update transcript (for analysis)
633
+ transcript.extend([
634
+ {"role": "user", "content": user_text},
635
+ {"role": "assistant", "content": tutor_response}
636
+ ])
637
+
638
+ # Increment exchange counter
639
+ exchanges += 1
640
+
641
+ # Check if this was the last exchange
642
+ if exchanges >= 3:
643
+ progress_msg = "Exchange 3 of 3 - Complete! Analysis ready."
644
+ else:
645
+ progress_msg = f"Exchange {exchanges} of 3 - Keep going!"
646
+
647
+ # Update model info
648
+ model_info_text = f"**🤖 Models:** LLM: {current_llm} | STT: Groq Whisper | TTS: gTTS"
649
+
650
+ # Return updated state
651
+ return (
652
+ chat_text,
653
+ exchanges,
654
+ history,
655
+ transcript,
656
+ f"Great! {progress_msg}",
657
+ progress_msg,
658
+ audio_path,
659
+ gr.update(value=None), # Clear audio input properly
660
+ gr.update(value=model_info_text) # Update model info
661
+ )
662
+
663
+ def show_analysis_if_complete(exchanges, transcript):
664
+ """Show analysis in sidebar if conversation is complete"""
665
+ if exchanges >= 3:
666
+ analysis = analyze_conversation(transcript)
667
+ return (
668
+ gr.update(visible=True, value=analysis), # analysis_box with content
669
+ gr.update(visible=True), # analysis_group
670
+ gr.update(interactive=False), # disable record button
671
+ gr.update(visible=False) # hide new topic button
672
+ )
673
+ return (
674
+ gr.update(), # no change to analysis_box
675
+ gr.update(), # no change to analysis_group
676
+ gr.update(interactive=True), # keep record button enabled
677
+ gr.update() # no change to new topic button
678
+ )
679
+
680
+ # Initialize API on load
681
+ def check_initialization():
682
+ status_msgs = []
683
+ if mistral_client:
684
+ status_msgs.append("✓ Mistral AI ready")
685
+ if gemini_api_key:
686
+ status_msgs.append("✓ Gemini fallback ready")
687
+ if groq_client:
688
+ status_msgs.append("✓ Groq STT ready")
689
+ status_msgs.append("✓ gTTS ready")
690
+
691
+ if not status_msgs:
692
+ return "❌ No APIs initialized!"
693
+
694
+ return " | ".join(status_msgs)
695
+
696
+ app.load(
697
+ fn=check_initialization,
698
+ outputs=status_text
699
+ )
700
+
701
+ # Start conversation
702
+ start_btn.click(
703
+ fn=start_conversation,
704
+ outputs=[
705
+ conversation_ui, sidebar_scenario, current_scenario,
706
+ chat_display, exchange_count, conversation_state,
707
+ full_transcript, audio_output, progress_text,
708
+ analysis_group, start_btn, new_topic_btn,
709
+ audio_input, record_btn, status_text
710
+ ]
711
+ )
712
+
713
+ # Generate new topic and start conversation
714
+ new_topic_btn.click(
715
+ fn=generate_new_topic_and_start,
716
+ outputs=[
717
+ conversation_ui, sidebar_scenario, current_scenario,
718
+ chat_display, exchange_count, conversation_state,
719
+ full_transcript, audio_output, progress_text,
720
+ analysis_group, start_btn, new_topic_btn,
721
+ audio_input, record_btn, status_text
722
+ ]
723
+ )
724
+
725
+ # Process user audio
726
+ record_btn.click(
727
+ fn=process_user_audio,
728
+ inputs=[
729
+ audio_input, chat_display, exchange_count,
730
+ conversation_state, full_transcript, current_scenario
731
+ ],
732
+ outputs=[
733
+ chat_display, exchange_count, conversation_state,
734
+ full_transcript, status_text, progress_text,
735
+ audio_output, audio_input, model_info
736
+ ],
737
+ queue=False # Disable queueing to avoid state issues
738
+ ).then(
739
+ fn=show_analysis_if_complete,
740
+ inputs=[exchange_count, full_transcript],
741
+ outputs=[analysis_box, analysis_group, record_btn, new_topic_btn],
742
+ queue=False # Disable queueing to avoid state issues
743
+ )
744
+
745
+ # Restart conversation
746
+ restart_btn.click(
747
+ fn=start_conversation,
748
+ outputs=[
749
+ conversation_ui, sidebar_scenario, current_scenario,
750
+ chat_display, exchange_count, conversation_state,
751
+ full_transcript, audio_output, progress_text,
752
+ analysis_group, start_btn, new_topic_btn,
753
+ audio_input, record_btn, status_text
754
+ ]
755
+ )
756
+
757
+ return app
758
+
759
+ # Launch the app
760
+ if __name__ == "__main__":
761
+ try:
762
+ app = create_app()
763
+ app.launch()
764
+ except Exception as e:
765
+ print(f"Failed to start app: {e}")
pyproject.toml ADDED
@@ -0,0 +1,17 @@
1
+ [project]
2
+ name = "mr-mistral"
3
+ version = "0.1.0"
4
+ description = "French Conversation Tutor using Mistral AI with Gemini fallback and Groq APIs"
5
+ readme = "README.md"
6
+ requires-python = ">=3.9"
7
+ dependencies = [
8
+ "python-dotenv>=1.0.0",
9
+ "gradio>=4.31.0",
10
+ "groq>=0.30.0",
11
+ "gtts>=2.5.4",
12
+ "mistralai>=1.2.0",
13
+ "numpy>=1.26.0",
14
+ "google-generativeai>=0.5.0",
15
+ "soundfile>=0.12.1",
16
+ "openai>=1.95.1",
17
+ ]
requirements.txt ADDED
@@ -0,0 +1,9 @@
1
+ python-dotenv>=1.0.0
2
+ gradio>=4.31.0
3
+ groq>=0.30.0
4
+ gtts>=2.5.4
5
+ mistralai>=1.2.0
6
+ numpy>=1.26.0
7
+ google-generativeai>=0.5.0
8
+ soundfile>=0.12.1
9
+ openai>=1.0.0
test.py ADDED
@@ -0,0 +1,128 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test script to verify API keys are working correctly
4
+ Run this before launching the main app to ensure all APIs are accessible
5
+ """
6
+
7
+ import os
8
+ from dotenv import load_dotenv
9
+ import openai
10
+ import numpy as np
11
+ import soundfile as sf
12
+ import io
13
+
14
+ # Load environment variables
15
+ load_dotenv()
16
+ openai_api_key = os.environ.get("OPENAI_API_KEY")
17
+
18
+ def test_mistral():
19
+ """Test Mistral API"""
20
+ try:
21
+ from mistralai import Mistral
22
+ api_key = os.environ.get("MISTRAL_API_KEY")
23
+ if not api_key:
24
+ print("❌ MISTRAL_API_KEY not found in environment")
25
+ return False
26
+
27
+ client = Mistral(api_key=api_key)
28
+ response = client.chat.complete(
29
+ model="mistral-large-latest",
30
+ messages=[{"role": "user", "content": "Say 'Hello' in French"}]
31
+ )
32
+ print(f"✅ Mistral API working: {response.choices[0].message.content[:50]}...")
33
+ return True
34
+ except Exception as e:
35
+ print(f"❌ Mistral API error: {str(e)}")
36
+ return False
37
+
38
+ def test_gemini():
39
+ """Test Gemini API"""
40
+ try:
41
+ import google.generativeai as genai
42
+ api_key = os.environ.get("GEMINI_API_KEY")
43
+ if not api_key:
44
+ print("⚠️ GEMINI_API_KEY not found (optional fallback)")
45
+ return False
46
+
47
+ genai.configure(api_key=api_key)
48
+ model = genai.GenerativeModel("models/gemini-1.5-flash-latest")
49
+ response = model.generate_content("Say 'Hello' in French")
50
+ print(f"✅ Gemini API working: {response.text[:50]}...")
51
+ return True
52
+ except Exception as e:
53
+ print(f"⚠️ Gemini API error (fallback): {str(e)}")
54
+ return False
55
+
56
+ def test_groq():
57
+ """Test Groq API"""
58
+ try:
59
+ from groq import Groq
60
+ api_key = os.environ.get("GROQ_API_KEY")
61
+ if not api_key:
62
+ print("❌ GROQ_API_KEY not found in environment")
63
+ return False
64
+
65
+ client = Groq(api_key=api_key)
66
+ # Test with a simple completion
67
+ response = client.chat.completions.create(
68
+ messages=[
69
+ {
70
+ "role": "user",
71
+ "content": "Explain the importance of fast language models",
72
+ }
73
+ ],
74
+ model="llama-3.3-70b-versatile",
75
+ )
76
+ print(f"✅ Groq API working: {response.choices[0].message.content}")
77
+ return True
78
+ except Exception as e:
79
+ print(f"❌ Groq API error: {str(e)}")
80
+ return False
81
+
82
+ def test_openai_whisper():
83
+ """Test OpenAI Whisper API (STT)"""
84
+ if not openai_api_key:
85
+ print("⚠️ OPENAI_API_KEY not found (OpenAI Whisper fallback not available)")
86
+ return False
87
+ try:
88
+ # Generate a 0.5s dummy silent audio (16kHz mono)
89
+ sr = 16000
90
+ duration = 0.5
91
+ audio = np.zeros(int(sr * duration), dtype=np.float32)
92
+ buf = io.BytesIO()
93
+ sf.write(buf, audio, sr, format='WAV')
94
+ buf.seek(0)
95
+ openai.api_key = openai_api_key
96
+ response = openai.audio.transcriptions.create(
97
+ model="whisper-1",
98
+ file=("audio.wav", buf),
99
+ language="fr"
100
+ )
101
+ print(f"✅ OpenAI Whisper API working: {response.text}")
102
+ return True
103
+ except Exception as e:
104
+ print(f"❌ OpenAI Whisper API error: {str(e)}")
105
+ return False
106
+
107
+ def main():
108
+ print("🔍 Testing API Keys...\n")
109
+
110
+ mistral_ok = test_mistral()
111
+ gemini_ok = test_gemini()
112
+ groq_ok = test_groq()
113
+ openai_ok = test_openai_whisper()
114
+
115
+ print("\n📊 Summary:")
116
+ if mistral_ok and (groq_ok or openai_ok):
117
+ print("✅ All required APIs are working! You can run the app.")
118
+ elif not mistral_ok and gemini_ok and (groq_ok or openai_ok):
119
+ print("✅ Gemini fallback and Groq/OpenAI Whisper are working. The app will use Gemini for LLM.")
120
+ else:
121
+ print("❌ Some required APIs are not working. Please check your API keys.")
122
+ if not groq_ok and not openai_ok:
123
+ print(" - Groq or OpenAI Whisper is required for speech-to-text")
124
+ if not mistral_ok and not gemini_ok:
125
+ print(" - Either Mistral or Gemini is required for the language model")
126
+
127
+ if __name__ == "__main__":
128
+ main()