ahundt committed on
Commit
2d3e55a
·
1 Parent(s): b7a0a78

app.py partly works, could chat with AI, has exit error


2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
Traceback (most recent call last):
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/blocks.py", line 2923, in block_thread
time.sleep(0.1)
~~~~~~~~~~^^^^^
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
KeyboardInterrupt
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/app.py", line 329, in <module>
demo.launch()
~~~~~~~~~~~^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/blocks.py", line 2829, in launch
self.block_thread()
~~~~~~~~~~~~~~~~~^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/blocks.py", line 2927, in block_thread
self.server.close()
~~~~~~~~~~~~~~~~~^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/http_server.py", line 69, in close
self.thread.join(timeout=5)
~~~~~~~~~~~~~~~~^^^^^^^^^^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/usr/local/Cellar/[email protected]/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/threading.py", line 1092, in join
self._handle.join(timeout)
~~~~~~~~~~~~~~~~~^^^^^^^^^
KeyboardInterrupt
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
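
The repeated `sent 1000 (OK); then received 1000 (OK)` lines above are consistent with the receive loop in `generator` continuing to run after the WebSocket has already closed normally: the `break` is commented out in this commit, so every pass through the loop raises again and is logged again. Below is a minimal sketch of one way to leave the loop on a normal close while still tolerating other errors. `ConnectionClosedOK` and `FakeSession` are stand-ins defined only so the example runs offline; which exception the Gemini SDK actually raises is an assumption (the message format resembles `websockets.exceptions.ConnectionClosedOK`).

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class ConnectionClosedOK(Exception):
    """Stand-in for the exception raised on a normal (code 1000) close."""


class FakeSession:
    """Hypothetical session: one turn with two chunks, then a normal close."""

    def __init__(self):
        self.turns = 0

    async def receive(self):
        self.turns += 1
        if self.turns > 1:
            raise ConnectionClosedOK("sent 1000 (OK); then received 1000 (OK)")
        for chunk in (b"a", b"b"):
            yield chunk


async def generator(session, quit_event):
    """Yield data until quit is requested or the connection closes normally."""
    while not quit_event.is_set():
        try:
            async for data in session.receive():
                yield data
        except ConnectionClosedOK:
            # A normal close cannot recover, so stop instead of logging forever.
            logger.info("Connection closed normally; leaving receive loop.")
            break
        except Exception as e:
            # Other errors may be transient: log, pause briefly, then retry.
            logger.error(f"Error receiving: {e}")
            await asyncio.sleep(0.5)


async def collect():
    quit_event = asyncio.Event()
    return [chunk async for chunk in generator(FakeSession(), quit_event)]


chunks = asyncio.run(collect())
```

The short sleep in the generic `except` branch is a design choice rather than something taken from the commit: it keeps a persistent failure from filling the log at full speed.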

Files changed (1)
  1. app.py +144 -155
app.py CHANGED
@@ -4,7 +4,7 @@ import os
4
  import time
5
  from io import BytesIO
6
  import logging
7
- import traceback # Import traceback
8
 
9
  import gradio as gr
10
  import numpy as np
@@ -18,30 +18,30 @@ from gradio_webrtc import (
18
  get_twilio_turn_credentials,
19
  )
20
  from PIL import Image
 
21
 
22
  # --- Setup Logging ---
23
  logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
24
  logger = logging.getLogger(__name__)
25
 
26
  # --- Global State ---
27
- twilio_available = None # None = not checked, True = available, False = unavailable
28
- gemini_connected = False # Track Gemini connection status
29
- load_complete = asyncio.Event() # Event to signal demo.load completion
30
 
31
 
32
  # --- Helper Functions ---
33
  def encode_audio(data: np.ndarray) -> dict:
34
- """Encode Audio data to send to the server."""
35
  if not isinstance(data, np.ndarray):
36
  raise TypeError("encode_audio expected a numpy.ndarray")
37
  try:
38
  return {"mime_type": "audio/pcm", "data": base64.b64encode(data.tobytes()).decode("UTF-8")}
39
  except Exception as e:
40
  logger.error(f"Error encoding audio: {e}")
41
- raise # Re-raise the exception after logging
42
 
43
  def encode_image(data: np.ndarray) -> dict:
44
- """Encode Image data to send to the server."""
45
  if not isinstance(data, np.ndarray):
46
  raise TypeError("encode_image expected a numpy.ndarray")
47
  try:
@@ -55,42 +55,35 @@ def encode_image(data: np.ndarray) -> dict:
55
  logger.error(f"Error encoding image: {e}")
56
  raise
57
 
58
- async def check_twilio_availability() -> bool:
59
- """Checks Twilio TURN server availability with retries and timeout."""
60
  global twilio_available
61
  timeout = 10
62
  retries = 3
63
  delay = 2
64
 
65
- try:
66
- async with asyncio.timeout(timeout):
67
- for attempt in range(retries):
68
- try:
69
- # VERY DETAILED LOGGING HERE
70
- logger.info(f"Attempting to get Twilio credentials (attempt {attempt + 1})...")
71
- credentials = get_twilio_turn_credentials()
72
- logger.info(f"Twilio credentials response: {credentials}") # Log the response
73
- if credentials:
74
- twilio_available = True
75
- logger.info("Twilio TURN server available.")
76
- return True
77
- except Exception as e:
78
- logger.warning(f"Attempt {attempt + 1} to get Twilio credentials failed: {e}")
79
- # Print the full traceback
80
- logger.warning(traceback.format_exc())
81
- if attempt < retries - 1:
82
- await asyncio.sleep(delay)
83
  twilio_available = False
84
- logger.warning("Twilio TURN server unavailable after multiple attempts.")
85
  return False
86
- except asyncio.TimeoutError:
87
- twilio_available = False
88
- logger.error(f"Twilio TURN server check timed out after {timeout} seconds.")
89
- return False
90
- except Exception as e:
91
- twilio_available = False
92
- logger.exception(f"Unexpected error checking Twilio availability: {e}")
93
- return False
94
 
95
 
96
 
@@ -121,7 +114,6 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
121
  async def video_receive(self, frame: np.ndarray):
122
  if self.session:
123
  try:
124
- # send image every 1 second
125
  if time.time() - self.last_frame_time > 1:
126
  self.last_frame_time = time.time()
127
  await self.session.send(encode_image(frame))
@@ -129,15 +121,15 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
129
  await self.session.send(encode_image(self.latest_args[2]))
130
  except Exception as e:
131
  logger.error(f"Error sending video frame: {e}")
132
- gr.Warning("Error sending video to Gemini. Check your connection and API key.")
133
- self.video_queue.put_nowait(frame) # Always put the frame in the queue
134
 
135
  async def video_emit(self) -> VideoEmitType:
136
  try:
137
  return await self.video_queue.get()
138
  except asyncio.CancelledError:
139
  logger.info("Video emit cancelled.")
140
- return None # Or some other default value
141
  except Exception as e:
142
  logger.exception(f"Error in video_emit: {e}")
143
  return None
@@ -148,51 +140,47 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
148
  try:
149
  client = genai.Client(api_key=api_key, http_options={"api_version": "v1alpha"})
150
  config = {"response_modalities": ["AUDIO"]}
151
- async with client.aio.live.connect(
152
  model="gemini-2.0-flash-exp", config=config
153
- ) as session:
154
  self.session = session
155
  gemini_connected = True
156
  asyncio.create_task(self.receive_audio())
157
  await self.quit.wait()
158
  except Exception as e:
159
  logger.error(f"Error connecting to Gemini: {e}")
160
- gemini_connected = False # Set connection status to False
161
  self.shutdown()
162
- # Display error in the UI
163
  gr.Warning(f"Failed to connect to Gemini: {e}")
164
- finally: # Update UI *after* connection attempt (both success and failure)
165
- gr.Info(f"Gemini connection status: {'Connected' if gemini_connected else 'Disconnected'}")
166
 
167
 
168
  async def generator(self):
169
- if not self.session: # Check if session exists
170
  logger.warning("Gemini session is not initialized.")
171
- return # Or raise an exception, depending on desired behavior
172
 
173
  while not self.quit.is_set():
174
  try:
175
- turn = await self.session.receive()
176
  async for response in turn:
177
  if data := response.data:
178
  yield data
179
  except Exception as e:
180
  logger.error(f"Error receiving from Gemini: {e}")
181
- gr.Warning("Error communicating with Gemini. Check network and API key.")
182
- break # Exit the loop on error
183
-
 
184
  async def receive_audio(self):
185
  try:
186
- async for audio_response in async_aggregate_bytes_to_16bit(
187
- self.generator()
188
- ):
189
  self.audio_queue.put_nowait(audio_response)
190
- except asyncio.CancelledError:
191
- logger.info("Audio receive cancelled.")
192
  except Exception as e:
193
  logger.exception(f"Error in receive_audio: {e}")
194
 
195
-
196
  async def receive(self, frame: tuple[int, np.ndarray]) -> None:
197
  _, array = frame
198
  array = array.squeeze()
@@ -202,17 +190,13 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
202
  await self.session.send(audio_message)
203
  except Exception as e:
204
  logger.error(f"Error sending audio: {e}")
205
- gr.Warning("Error sending audio to Gemini. Check your connection and API key.")
206
-
207
 
208
  async def emit(self) -> AudioEmitType:
209
  if not self.args_set.is_set():
210
  await self.wait_for_args()
211
  if self.session is None:
212
- try:
213
  asyncio.create_task(self.connect(self.latest_args[1]))
214
- except Exception as e:
215
- logger.error(f"emit error connecting: {e}")
216
 
217
  try:
218
  array = await self.audio_queue.get()
@@ -222,22 +206,28 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
222
  return (self.output_sample_rate, np.array([]))
223
  except Exception as e:
224
  logger.exception(f"Error in emit: {e}")
225
- return (self.output_sample_rate, np.array([])) # Return empty array on error
226
-
227
 
228
  def shutdown(self) -> None:
229
  global gemini_connected
230
- gemini_connected = False # Reset on shutdown
231
  logger.info("Shutting down GeminiHandler.")
232
  self.quit.set()
233
  self.connection = None
234
  self.args_set.clear()
235
  if self.session:
236
- # No good async close method, this can get stuck.
237
- # asyncio.create_task(self.session.close())
238
  pass
239
  self.quit.clear()
240
- gr.Info("Gemini connection closed.")
241
 
242
 
243
  # --- Gradio UI ---
@@ -245,96 +235,95 @@ css = """
245
  #video-source {max-width: 600px !important; max-height: 600 !important;}
246
  """
247
 
248
- async def main():
249
- global twilio_available, gemini_connected
250
-
251
- with gr.Blocks(css=css) as demo:
252
- gr.HTML(
253
- """
254
- <div style='display: flex; align-items: center; justify-content: center; gap: 20px'>
255
- <div style="background-color: var(--block-background-fill); border-radius: 8px">
256
- <img src="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png" style="width: 100px; height: 100px;">
257
- </div>
258
- <div>
259
- <h1>Gen AI SDK Voice Chat</h1>
260
- <p>Speak with Gemini using real-time audio + video streaming</p>
261
- <p>Powered by <a href="https://gradio.app/">Gradio</a> and <a href=https://freddyaboulton.github.io/gradio-webrtc/">WebRTC</a>⚡️</p>
262
- <p>Get an API Key <a href="https://support.google.com/googleapi/answer/6158862?hl=en">here</a></p>
263
- </div>
264
- </div>
265
  """
266
  )
267
- twilio_status_message = gr.Markdown("") # For displaying Twilio status
268
- gemini_status_message = gr.Markdown("") # For Gemini status
269
-
270
- with gr.Row() as api_key_row:
271
- api_key = gr.Textbox(
272
- label="API Key",
273
- type="password",
274
- placeholder="Enter your API Key",
275
- value=os.getenv("GOOGLE_API_KEY"),
276
  )
277
- with gr.Row(visible=False) as row:
278
- with gr.Column():
279
- webrtc = WebRTC(
280
- label="Video Chat",
281
- modality="audio-video",
282
- mode="send-receive",
283
- elem_id="video-source",
284
- rtc_configuration={"iceServers": []}, # DUMMY CONFIGURATION
285
- icon="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png",
286
- pulse_color="rgb(35, 157, 225)",
287
- icon_button_color="rgb(35, 157, 225)",
288
- )
289
- with gr.Column():
290
- image_input = gr.Image(label="Image", type="numpy", sources=["upload", "clipboard"])
291
-
292
-
293
- async def update_twilio_status_and_ui():
294
- """Updates Twilio status and UI elements."""
295
- await check_twilio_availability() # Check Twilio availability
296
-
297
- if twilio_available:
298
- rtc_config = get_twilio_turn_credentials()
299
- message = "Twilio TURN server available. Connection should be reliable."
300
- else:
301
- rtc_config = None
302
- message = "**Warning:** Twilio TURN server unavailable. Connection might be less reliable or fail if you are behind a symmetric NAT."
303
- load_complete.set() # Signal that load is complete - *before* returning
304
- return gr.update(rtc_configuration=rtc_config), gr.update(value=message)
305
-
306
- # Check Twilio availability and update UI on startup.
307
- demo.load(update_twilio_status_and_ui, [], [webrtc, twilio_status_message])
308
-
309
- async def start_streaming():
310
- """Starts the WebRTC streaming after load_complete is set."""
311
- await load_complete.wait() # *Wait* for load to complete
312
- await asyncio.sleep(0.1) # Small delay (optional, but can help)
313
- webrtc.stream(
314
- GeminiHandler(),
315
- inputs=[webrtc, api_key, image_input],
316
- outputs=[webrtc],
317
- time_limit=90,
318
- concurrency_limit=None, # Removed concurrency limit
319
  )
320
-
321
- # Use .then() to chain start_streaming *after* demo.load
322
- demo.load(None, [], []).then(start_streaming, [], [])
323
-
324
-
325
- def check_api_key(api_key_str):
326
- if not api_key_str:
327
- return gr.update(visible=True), gr.update(visible=False), gr.update(value="Please enter a valid API key")
328
- return gr.update(visible=False), gr.update(visible=True), gr.update(value="")
329
-
330
- api_key.submit(
331
- check_api_key,
332
- [api_key],
333
- [api_key_row, row, twilio_status_message],
334
  )
335
 
 
 
 
 
 
336
 
337
- demo.launch()
338
-
339
- if __name__ == "__main__":
340
- asyncio.run(main())
 
4
  import time
5
  from io import BytesIO
6
  import logging
7
+ import traceback
8
 
9
  import gradio as gr
10
  import numpy as np
 
18
  get_twilio_turn_credentials,
19
  )
20
  from PIL import Image
21
+ import requests # Use requests for synchronous Twilio check
22
 
23
  # --- Setup Logging ---
24
  logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
25
  logger = logging.getLogger(__name__)
26
 
27
  # --- Global State ---
28
+ twilio_available = None # Will be set *before* Gradio initialization
29
+ gemini_connected = False
 
30
 
31
 
32
  # --- Helper Functions ---
33
  def encode_audio(data: np.ndarray) -> dict:
34
+ """Encode Audio data."""
35
  if not isinstance(data, np.ndarray):
36
  raise TypeError("encode_audio expected a numpy.ndarray")
37
  try:
38
  return {"mime_type": "audio/pcm", "data": base64.b64encode(data.tobytes()).decode("UTF-8")}
39
  except Exception as e:
40
  logger.error(f"Error encoding audio: {e}")
41
+ raise
42
 
43
  def encode_image(data: np.ndarray) -> dict:
44
+ """Encode Image data."""
45
  if not isinstance(data, np.ndarray):
46
  raise TypeError("encode_image expected a numpy.ndarray")
47
  try:
 
55
  logger.error(f"Error encoding image: {e}")
56
  raise
57
 
58
+ def check_twilio_availability_sync() -> bool:
59
+ """Checks Twilio TURN server availability (synchronous version)."""
60
  global twilio_available
61
  timeout = 10
62
  retries = 3
63
  delay = 2
64
 
65
+ for attempt in range(retries):
66
+ try:
67
+ logger.info(f"Attempting to get Twilio credentials (attempt {attempt + 1})...")
68
+ credentials = get_twilio_turn_credentials()
69
+ logger.info(f"Twilio credentials response: {credentials}")
70
+ if credentials:
71
+ twilio_available = True
72
+ logger.info("Twilio TURN server available.")
73
+ return True
74
+ except requests.exceptions.RequestException as e:
75
+ logger.warning(f"Attempt {attempt + 1}: {e}")
76
+ logger.warning(traceback.format_exc())
77
+ if attempt < retries - 1:
78
+ time.sleep(delay)
79
+ except Exception as e:
80
+ logger.exception(f"Unexpected error checking Twilio: {e}")
 
 
81
  twilio_available = False
 
82
  return False
83
+
84
+ twilio_available = False
85
+ logger.warning("Twilio TURN server unavailable.")
86
+ return False
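
Review note on `check_twilio_availability_sync` as added here: only `requests.exceptions.RequestException` leads to another attempt, any other exception returns `False` on the first try, and `timeout` is assigned but never used. A minimal sketch of a retry helper that treats every failed attempt the same way is shown below. `fetch_with_retries`, its parameters, and `flaky_fetch` are hypothetical names for illustration; in the app, `fetch` would presumably be `get_twilio_turn_credentials`.

```python
import logging
import time

logger = logging.getLogger(__name__)


def fetch_with_retries(fetch, retries=3, delay=2.0, sleep=time.sleep):
    """Call fetch() up to `retries` times; return its first truthy result or None."""
    for attempt in range(1, retries + 1):
        try:
            result = fetch()
            if result:
                return result
            logger.warning("Attempt %d returned no credentials.", attempt)
        except Exception as e:
            logger.warning("Attempt %d failed: %s", attempt, e)
        # Pause between attempts, but not after the last one.
        if attempt < retries:
            sleep(delay)
    return None


# Usage with a stand-in fetch that fails twice before succeeding.
attempts = []
pauses = []


def flaky_fetch():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return {"iceServers": ["placeholder"]}


result = fetch_with_retries(flaky_fetch, retries=3, delay=2.0, sleep=pauses.append)
```

Passing `sleep` as a parameter is only there so the delay can be replaced in a test; the default keeps the blocking `time.sleep` behaviour used in the commit.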
 
 
 
 
87
 
88
 
89
 
 
114
  async def video_receive(self, frame: np.ndarray):
115
  if self.session:
116
  try:
 
117
  if time.time() - self.last_frame_time > 1:
118
  self.last_frame_time = time.time()
119
  await self.session.send(encode_image(frame))
 
121
  await self.session.send(encode_image(self.latest_args[2]))
122
  except Exception as e:
123
  logger.error(f"Error sending video frame: {e}")
124
+ gr.Warning("Error sending video to Gemini.")
125
+ self.video_queue.put_nowait(frame)
126
 
127
  async def video_emit(self) -> VideoEmitType:
128
  try:
129
  return await self.video_queue.get()
130
  except asyncio.CancelledError:
131
  logger.info("Video emit cancelled.")
132
+ return None
133
  except Exception as e:
134
  logger.exception(f"Error in video_emit: {e}")
135
  return None
 
140
  try:
141
  client = genai.Client(api_key=api_key, http_options={"api_version": "v1alpha"})
142
  config = {"response_modalities": ["AUDIO"]}
143
+ async with client.aio.live.connect( # Use async with, like the original
144
  model="gemini-2.0-flash-exp", config=config
145
+ ) as session: # <--- Get session from context manager
146
  self.session = session
147
  gemini_connected = True
148
  asyncio.create_task(self.receive_audio())
149
  await self.quit.wait()
150
  except Exception as e:
151
  logger.error(f"Error connecting to Gemini: {e}")
152
+ gemini_connected = False
153
  self.shutdown()
 
154
  gr.Warning(f"Failed to connect to Gemini: {e}")
155
+ finally:
156
+ update_gemini_status_sync()
157
 
158
 
159
  async def generator(self):
160
+ if not self.session:
161
  logger.warning("Gemini session is not initialized.")
162
+ return
163
 
164
  while not self.quit.is_set():
165
  try:
166
+ turn = self.session.receive() # NO await here, like the original
167
  async for response in turn:
168
  if data := response.data:
169
  yield data
170
  except Exception as e:
171
  logger.error(f"Error receiving from Gemini: {e}")
172
+ # NOTE: Do not exit loop.
173
+ # The user may need to say something else
174
+ # if there is a problem.
175
+ # break
176
  async def receive_audio(self):
177
  try:
178
+ # Correctly use the async generator
179
+ async for audio_response in async_aggregate_bytes_to_16bit(self.generator()):
 
180
  self.audio_queue.put_nowait(audio_response)
 
 
181
  except Exception as e:
182
  logger.exception(f"Error in receive_audio: {e}")
183
 
 
184
  async def receive(self, frame: tuple[int, np.ndarray]) -> None:
185
  _, array = frame
186
  array = array.squeeze()
 
190
  await self.session.send(audio_message)
191
  except Exception as e:
192
  logger.error(f"Error sending audio: {e}")
193
+ gr.Warning("Error sending audio to Gemini.")
 
194
 
195
  async def emit(self) -> AudioEmitType:
196
  if not self.args_set.is_set():
197
  await self.wait_for_args()
198
  if self.session is None:
 
199
  asyncio.create_task(self.connect(self.latest_args[1]))
 
 
200
 
201
  try:
202
  array = await self.audio_queue.get()
 
206
  return (self.output_sample_rate, np.array([]))
207
  except Exception as e:
208
  logger.exception(f"Error in emit: {e}")
209
+ return (self.output_sample_rate, np.array([]))
 
210
 
211
  def shutdown(self) -> None:
212
  global gemini_connected
213
+ gemini_connected = False
214
  logger.info("Shutting down GeminiHandler.")
215
  self.quit.set()
216
  self.connection = None
217
  self.args_set.clear()
218
  if self.session:
219
+ # No good async close.
 
220
  pass
221
  self.quit.clear()
222
+ update_gemini_status_sync()
223
+
224
+
225
+ def update_gemini_status_sync():
226
+ """Updates the Gemini status message (synchronous version)."""
227
+ status = "Connected" if gemini_connected else "Disconnected"
228
+ if 'demo' in locals() and demo.running:
229
+ gr.update(value=f"Gemini connection status: {status}")
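
Review note on `update_gemini_status_sync`: inside a function, `locals()` contains only that function's own names, so `'demo' in locals()` is false even when a module-level `demo` exists, and the value returned by `gr.update(...)` is discarded, so the call has no visible effect. The sketch below shows the scoping behaviour with plain Python and a status function that returns its text instead. `demo` here is a stand-in object and the function names are hypothetical.

```python
demo = object()  # stand-in for the module-level gr.Blocks instance
gemini_connected = False


def demo_visible_via_locals() -> bool:
    # locals() inside a function holds only names local to this call.
    return "demo" in locals()


def demo_visible_via_globals() -> bool:
    # Module-level names are found through globals() instead.
    return "demo" in globals()


def gemini_status_text() -> str:
    """Return the status line so a caller can decide how to display it."""
    status = "Connected" if gemini_connected else "Disconnected"
    return f"Gemini connection status: {status}"


locals_check = demo_visible_via_locals()
status_line = gemini_status_text()
```

A function like `gemini_status_text` could be returned from a Gradio event handler whose outputs include `gemini_status_message`, which is the usual way a component value gets updated.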
230
+
231
 
232
 
233
  # --- Gradio UI ---
 
235
  #video-source {max-width: 600px !important; max-height: 600 !important;}
236
  """
237
 
238
+ # Perform Twilio check *before* Gradio UI definition (synchronously)
239
+ if __name__ == "__main__":
240
+ check_twilio_availability_sync()
241
+
242
+
243
+ with gr.Blocks(css=css) as demo:
244
+ gr.HTML(
245
  """
246
+ <div style='display: flex; align-items: center; justify-content: center; gap: 20px'>
247
+ <div style="background-color: var(--block-background-fill); border-radius: 8px">
248
+ <img src="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png" style="width: 100px; height: 100px;">
249
+ </div>
250
+ <div>
251
+ <h1>Gen AI SDK Voice Chat</h1>
252
+ <p>Speak with Gemini using real-time audio + video streaming</p>
253
+ <p>Powered by <a href="https://gradio.app/">Gradio</a> and <a href=https://freddyaboulton.github.io/gradio-webrtc/">WebRTC</a>⚡️</p>
254
+ <p>Get an API Key <a href="https://support.google.com/googleapi/answer/6158862?hl=en">here</a></p>
255
+ </div>
256
+ </div>
257
+ """
258
+ )
259
+ twilio_status_message = gr.Markdown("")
260
+ gemini_status_message = gr.Markdown("")
261
+
262
+ with gr.Row() as api_key_row:
263
+ api_key = gr.Textbox(
264
+ label="API Key",
265
+ type="password",
266
+ placeholder="Enter your API Key",
267
+ value=os.getenv("GOOGLE_API_KEY"),
268
  )
269
+ with gr.Row(visible=False) as row:
270
+ with gr.Column():
271
+ # Set rtc_configuration based on the *pre-checked* twilio_available
272
+ rtc_config = get_twilio_turn_credentials() if twilio_available else None
273
+ # Explicitly specify codecs (example - you might need to adjust)
274
+ if rtc_config:
275
+ rtc_config['codecs'] = ['VP8', 'H264'] # Prefer VP8, then H.264
276
+ webrtc = WebRTC(
277
+ label="Video Chat",
278
+ modality="audio-video",
279
+ mode="send-receive",
280
+ elem_id="video-source",
281
+ rtc_configuration=rtc_config,
282
+ icon="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png",
283
+ pulse_color="rgb(35, 157, 225)",
284
+ icon_button_color="rgb(35, 157, 225)",
285
  )
286
+ with gr.Column():
287
+ image_input = gr.Image(label="Image", type="numpy", sources=["upload", "clipboard"])
288
+
289
+
290
+ def update_twilio_status_ui():
291
+ if twilio_available:
292
+ message = "Twilio TURN server available."
293
+ else:
294
+ message = "**Warning:** Twilio TURN server unavailable. Connection might be less reliable."
295
+ return gr.update(value=message)
296
+
297
+ demo.load(update_twilio_status_ui, [], [twilio_status_message])
298
+
299
+ webrtc.stream(
300
+ GeminiHandler(),
301
+ inputs=[webrtc, api_key, image_input],
302
+ outputs=[webrtc],
303
+ time_limit=90,
304
+ concurrency_limit=None,
305
+ )
306
+
307
+
308
+ def check_api_key(api_key_str):
309
+ if not api_key_str:
310
+ return (
311
+ gr.update(visible=True),
312
+ gr.update(visible=False),
313
+ gr.update(value="Please enter a valid API key"),
314
+ gr.update(value=""),
315
  )
316
+ return (
317
+ gr.update(visible=False),
318
+ gr.update(visible=True),
319
+ gr.update(value=""),
320
+ gr.update(value=""),
321
  )
322
 
323
+ api_key.submit(
324
+ check_api_key,
325
+ [api_key],
326
+ [api_key_row, row, twilio_status_message, gemini_status_message],
327
+ )
328
 
329
+ demo.launch()