Spaces: freddyaboulton/gemini-audio-video-chat

ahundt committed on Feb 21

Commit 2d3e55a

1 Parent(s): b7a0a78

app.py partly works, could chat with AI, has exit error

2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,864 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,865 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
Traceback (most recent call last):
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/blocks.py", line 2923, in block_thread
time.sleep(0.1)
~~~~~~~~~~^^^^^
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
KeyboardInterrupt
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
2025-02-20 22:25:41,866 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/app.py", line 329, in <module>
demo.launch()
~~~~~~~~~~~^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/blocks.py", line 2829, in launch
self.block_thread()
~~~~~~~~~~~~~~~~~^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/blocks.py", line 2927, in block_thread
self.server.close()
~~~~~~~~~~~~~~~~~^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/Users/athundt/source/gemini-audio-video-chat/.venv/lib/python3.13/site-packages/gradio/http_server.py", line 69, in close
self.thread.join(timeout=5)
~~~~~~~~~~~~~~~~^^^^^^^^^^^
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
File "/usr/local/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/threading.py", line 1092, in join
self._handle.join(timeout)
~~~~~~~~~~~~~~~~~^^^^^^^^^
KeyboardInterrupt
2025-02-20 22:25:41,867 - ERROR - Error receiving from Gemini: sent 1000 (OK); then received 1000 (OK)
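
The repeated `sent 1000 (OK); then received 1000 (OK)` message reads like a normal WebSocket close (status code 1000) being reported on every pass through the `generator()` loop in `app.py`, which after this commit no longer breaks on error. Below is a minimal sketch, assuming that is the cause, of a receive loop that stops once the connection reports a close instead of retrying forever. `SessionClosed` and `FakeSession` are hypothetical stand-ins so the example runs without the Gemini SDK; in the real handler the exception type would come from whichever WebSocket library the SDK uses.

```python
import asyncio


class SessionClosed(Exception):
    """Hypothetical stand-in for the library's 'connection closed' error."""


class FakeSession:
    """Hypothetical session: returns queued chunks, then reports a close."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    async def receive(self):
        if not self._chunks:
            raise SessionClosed("sent 1000 (OK); then received 1000 (OK)")
        return self._chunks.pop(0)


async def audio_chunks(session, quit_event):
    """Yield chunks until asked to quit or the connection closes."""
    while not quit_event.is_set():
        try:
            chunk = await session.receive()
        except SessionClosed:
            # A closed connection cannot recover; leave the loop once
            # rather than logging the same error on every iteration.
            break
        yield chunk


async def collect(session):
    quit_event = asyncio.Event()
    return [chunk async for chunk in audio_chunks(session, quit_event)]


received = asyncio.run(collect(FakeSession([b"a", b"b"])))
```

Other errors could still be logged and retried after a short sleep; only the closed case needs to end the loop.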

Files changed (1)

app.py +144 -155

app.py CHANGED

@@ -4,7 +4,7 @@ import os
 import time
 from io import BytesIO
 import logging
-import traceback  # Import traceback
 import gradio as gr
 import numpy as np
@@ -18,30 +18,30 @@ from gradio_webrtc import (
     get_twilio_turn_credentials,
 )
 from PIL import Image
 # --- Setup Logging ---
 logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
 logger = logging.getLogger(__name__)
 # --- Global State ---
-twilio_available = None  # None = not checked, True = available, False = unavailable
-gemini_connected = False  # Track Gemini connection status
-load_complete = asyncio.Event()  # Event to signal demo.load completion
 # --- Helper Functions ---
 def encode_audio(data: np.ndarray) -> dict:
-    """Encode Audio data to send to the server."""
     if not isinstance(data, np.ndarray):
         raise TypeError("encode_audio expected a numpy.ndarray")
     try:
         return {"mime_type": "audio/pcm", "data": base64.b64encode(data.tobytes()).decode("UTF-8")}
     except Exception as e:
         logger.error(f"Error encoding audio: {e}")
-        raise  # Re-raise the exception after logging
 def encode_image(data: np.ndarray) -> dict:
-    """Encode Image data to send to the server."""
     if not isinstance(data, np.ndarray):
         raise TypeError("encode_image expected a numpy.ndarray")
     try:
@@ -55,42 +55,35 @@ def encode_image(data: np.ndarray) -> dict:
         logger.error(f"Error encoding image: {e}")
         raise
-async def check_twilio_availability() -> bool:
-    """Checks Twilio TURN server availability with retries and timeout."""
     global twilio_available
     timeout = 10
     retries = 3
     delay = 2
-    try:
-        async with asyncio.timeout(timeout):
-            for attempt in range(retries):
-                try:
-                    # VERY DETAILED LOGGING HERE
-                    logger.info(f"Attempting to get Twilio credentials (attempt {attempt + 1})...")
-                    credentials = get_twilio_turn_credentials()
-                    logger.info(f"Twilio credentials response: {credentials}")  # Log the response
-                    if credentials:
-                        twilio_available = True
-                        logger.info("Twilio TURN server available.")
-                        return True
-                except Exception as e:
-                    logger.warning(f"Attempt {attempt + 1} to get Twilio credentials failed: {e}")
-                    # Print the full traceback
-                    logger.warning(traceback.format_exc())
-                    if attempt < retries - 1:
-                        await asyncio.sleep(delay)
             twilio_available = False
-            logger.warning("Twilio TURN server unavailable after multiple attempts.")
             return False
-    except asyncio.TimeoutError:
-        twilio_available = False
-        logger.error(f"Twilio TURN server check timed out after {timeout} seconds.")
-        return False
-    except Exception as e:
-        twilio_available = False
-        logger.exception(f"Unexpected error checking Twilio availability: {e}")
-        return False
@@ -121,7 +114,6 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
     async def video_receive(self, frame: np.ndarray):
         if self.session:
             try:
-                # send image every 1 second
                 if time.time() - self.last_frame_time > 1:
                     self.last_frame_time = time.time()
                     await self.session.send(encode_image(frame))
@@ -129,15 +121,15 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
                         await self.session.send(encode_image(self.latest_args[2]))
             except Exception as e:
                 logger.error(f"Error sending video frame: {e}")
-                gr.Warning("Error sending video to Gemini. Check your connection and API key.")
-        self.video_queue.put_nowait(frame)  # Always put the frame in the queue
     async def video_emit(self) -> VideoEmitType:
         try:
             return await self.video_queue.get()
         except asyncio.CancelledError:
             logger.info("Video emit cancelled.")
-            return None # Or some other default value
         except Exception as e:
             logger.exception(f"Error in video_emit: {e}")
             return None
@@ -148,51 +140,47 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
             try:
                 client = genai.Client(api_key=api_key, http_options={"api_version": "v1alpha"})
                 config = {"response_modalities": ["AUDIO"]}
-                async with client.aio.live.connect(
                     model="gemini-2.0-flash-exp", config=config
-                ) as session:
                     self.session = session
                     gemini_connected = True
                     asyncio.create_task(self.receive_audio())
                     await self.quit.wait()
             except Exception as e:
                 logger.error(f"Error connecting to Gemini: {e}")
-                gemini_connected = False  # Set connection status to False
                 self.shutdown()
-                # Display error in the UI
                 gr.Warning(f"Failed to connect to Gemini: {e}")
-            finally:  # Update UI *after* connection attempt (both success and failure)
-                gr.Info(f"Gemini connection status: {'Connected' if gemini_connected else 'Disconnected'}")
     async def generator(self):
-        if not self.session:  # Check if session exists
             logger.warning("Gemini session is not initialized.")
-            return  # Or raise an exception, depending on desired behavior
         while not self.quit.is_set():
             try:
-                turn = await self.session.receive()
                 async for response in turn:
                     if data := response.data:
                         yield data
             except Exception as e:
                 logger.error(f"Error receiving from Gemini: {e}")
-                gr.Warning("Error communicating with Gemini.  Check network and API key.")
-                break # Exit the loop on error
     async def receive_audio(self):
         try:
-            async for audio_response in async_aggregate_bytes_to_16bit(
-                self.generator()
-            ):
                 self.audio_queue.put_nowait(audio_response)
-        except asyncio.CancelledError:
-            logger.info("Audio receive cancelled.")
         except Exception as e:
             logger.exception(f"Error in receive_audio: {e}")
     async def receive(self, frame: tuple[int, np.ndarray]) -> None:
         _, array = frame
         array = array.squeeze()
@@ -202,17 +190,13 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
                 await self.session.send(audio_message)
         except Exception as e:
             logger.error(f"Error sending audio: {e}")
-            gr.Warning("Error sending audio to Gemini. Check your connection and API key.")
     async def emit(self) -> AudioEmitType:
         if not self.args_set.is_set():
             await self.wait_for_args()
         if self.session is None:
-          try:
             asyncio.create_task(self.connect(self.latest_args[1]))
-          except Exception as e:
-                logger.error(f"emit error connecting: {e}")
         try:
             array = await self.audio_queue.get()
@@ -222,22 +206,28 @@ class GeminiHandler(AsyncAudioVideoStreamHandler):
             return (self.output_sample_rate, np.array([]))
         except Exception as e:
             logger.exception(f"Error in emit: {e}")
-            return (self.output_sample_rate, np.array([]))  # Return empty array on error
     def shutdown(self) -> None:
         global gemini_connected
-        gemini_connected = False # Reset on shutdown
         logger.info("Shutting down GeminiHandler.")
         self.quit.set()
         self.connection = None
         self.args_set.clear()
         if self.session:
-             # No good async close method, this can get stuck.
-            #  asyncio.create_task(self.session.close())
             pass
         self.quit.clear()
-        gr.Info("Gemini connection closed.")
 # --- Gradio UI ---
@@ -245,96 +235,95 @@ css = """
 #video-source {max-width: 600px !important; max-height: 600 !important;}
 """
-async def main():
-    global twilio_available, gemini_connected
-    with gr.Blocks(css=css) as demo:
-        gr.HTML(
-            """
-        <div style='display: flex; align-items: center; justify-content: center; gap: 20px'>
-            <div style="background-color: var(--block-background-fill); border-radius: 8px">
-                <img src="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png" style="width: 100px; height: 100px;">
-            </div>
-            <div>
-                <h1>Gen AI SDK Voice Chat</h1>
-                <p>Speak with Gemini using real-time audio + video streaming</p>
-                <p>Powered by <a href="https://gradio.app/">Gradio</a> and <a href=https://freddyaboulton.github.io/gradio-webrtc/">WebRTC</a>⚡️</p>
-                <p>Get an API Key <a href="https://support.google.com/googleapi/answer/6158862?hl=en">here</a></p>
-            </div>
-        </div>
         """
         )
-        twilio_status_message = gr.Markdown("")  # For displaying Twilio status
-        gemini_status_message = gr.Markdown("")  # For Gemini status
-        with gr.Row() as api_key_row:
-            api_key = gr.Textbox(
-                label="API Key",
-                type="password",
-                placeholder="Enter your API Key",
-                value=os.getenv("GOOGLE_API_KEY"),
             )
-        with gr.Row(visible=False) as row:
-            with gr.Column():
-                webrtc = WebRTC(
-                    label="Video Chat",
-                    modality="audio-video",
-                    mode="send-receive",
-                    elem_id="video-source",
-                    rtc_configuration={"iceServers": []},  # DUMMY CONFIGURATION
-                    icon="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png",
-                    pulse_color="rgb(35, 157, 225)",
-                    icon_button_color="rgb(35, 157, 225)",
-                )
-            with gr.Column():
-                image_input = gr.Image(label="Image", type="numpy", sources=["upload", "clipboard"])
-        async def update_twilio_status_and_ui():
-            """Updates Twilio status and UI elements."""
-            await check_twilio_availability()  # Check Twilio availability
-            if twilio_available:
-                rtc_config = get_twilio_turn_credentials()
-                message = "Twilio TURN server available.  Connection should be reliable."
-            else:
-                rtc_config = None
-                message = "**Warning:** Twilio TURN server unavailable.  Connection might be less reliable or fail if you are behind a symmetric NAT."
-            load_complete.set()  # Signal that load is complete - *before* returning
-            return gr.update(rtc_configuration=rtc_config), gr.update(value=message)
-        # Check Twilio availability and update UI on startup.
-        demo.load(update_twilio_status_and_ui, [], [webrtc, twilio_status_message])
-        async def start_streaming():
-            """Starts the WebRTC streaming after load_complete is set."""
-            await load_complete.wait()  # *Wait* for load to complete
-            await asyncio.sleep(0.1)     # Small delay (optional, but can help)
-            webrtc.stream(
-                GeminiHandler(),
-                inputs=[webrtc, api_key, image_input],
-                outputs=[webrtc],
-                time_limit=90,
-                concurrency_limit=None,  # Removed concurrency limit
             )
-        # Use .then() to chain start_streaming *after* demo.load
-        demo.load(None, [], []).then(start_streaming, [], [])
-        def check_api_key(api_key_str):
-            if not api_key_str:
-                return gr.update(visible=True), gr.update(visible=False), gr.update(value="Please enter a valid API key")
-            return gr.update(visible=False), gr.update(visible=True), gr.update(value="")
-        api_key.submit(
-            check_api_key,
-            [api_key],
-            [api_key_row, row, twilio_status_message],
         )
-    demo.launch()
-if __name__ == "__main__":
-    asyncio.run(main())

 import time
 from io import BytesIO
 import logging
+import traceback
 import gradio as gr
 import numpy as np
     get_twilio_turn_credentials,
 )
 from PIL import Image
+import requests  # Use requests for synchronous Twilio check
 # --- Setup Logging ---
 logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
 logger = logging.getLogger(__name__)
 # --- Global State ---
+twilio_available = None  # Will be set *before* Gradio initialization
+gemini_connected = False
 # --- Helper Functions ---
 def encode_audio(data: np.ndarray) -> dict:
+    """Encode Audio data."""
     if not isinstance(data, np.ndarray):
         raise TypeError("encode_audio expected a numpy.ndarray")
     try:
         return {"mime_type": "audio/pcm", "data": base64.b64encode(data.tobytes()).decode("UTF-8")}
     except Exception as e:
         logger.error(f"Error encoding audio: {e}")
+        raise
 def encode_image(data: np.ndarray) -> dict:
+    """Encode Image data."""
     if not isinstance(data, np.ndarray):
         raise TypeError("encode_image expected a numpy.ndarray")
     try:
         logger.error(f"Error encoding image: {e}")
         raise
+def check_twilio_availability_sync() -> bool:
+    """Checks Twilio TURN server availability (synchronous version)."""
     global twilio_available
     timeout = 10
     retries = 3
     delay = 2
+    for attempt in range(retries):
+        try:
+            logger.info(f"Attempting to get Twilio credentials (attempt {attempt + 1})...")
+            credentials = get_twilio_turn_credentials()
+            logger.info(f"Twilio credentials response: {credentials}")
+            if credentials:
+                twilio_available = True
+                logger.info("Twilio TURN server available.")
+                return True
+        except requests.exceptions.RequestException as e:
+            logger.warning(f"Attempt {attempt + 1}: {e}")
+            logger.warning(traceback.format_exc())
+            if attempt < retries - 1:
+                time.sleep(delay)
+        except Exception as e:
+            logger.exception(f"Unexpected error checking Twilio: {e}")
             twilio_available = False
             return False
+    twilio_available = False
+    logger.warning("Twilio TURN server unavailable.")
+    return False
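
In the synchronous version above, `timeout = 10` is still assigned but nothing in the function uses it, so the check runs for as long as the three attempts take. Below is a minimal sketch of a retry helper that also honors an overall deadline. `retry_with_deadline` and its parameters are hypothetical names, and the `fetch` argument stands in for `get_twilio_turn_credentials`. The deadline is only checked between attempts, so it does not interrupt a call that is already blocking.

```python
import time


def retry_with_deadline(fetch, retries=3, delay=2.0, timeout=10.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch() up to `retries` times; stop early once `timeout` elapses.

    Returns the first truthy result, or None if every attempt fails.
    """
    deadline = clock() + timeout
    for attempt in range(retries):
        try:
            result = fetch()
            if result:
                return result
        except Exception:
            # Treat any failure as "try again"; real code would log it.
            pass
        # Skip the sleep after the last attempt or when out of time.
        if attempt == retries - 1 or clock() + delay > deadline:
            break
        sleep(delay)
    return None


# Demonstration with a fake fetch that succeeds on the third call.
calls = []


def flaky_fetch():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("temporary failure")
    return {"iceServers": []}


result = retry_with_deadline(flaky_fetch, sleep=lambda seconds: None)
```

Passing `sleep` and `clock` as parameters keeps the helper easy to test without real waiting.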
     async def video_receive(self, frame: np.ndarray):
         if self.session:
             try:
                 if time.time() - self.last_frame_time > 1:
                     self.last_frame_time = time.time()
                     await self.session.send(encode_image(frame))
                         await self.session.send(encode_image(self.latest_args[2]))
             except Exception as e:
                 logger.error(f"Error sending video frame: {e}")
+                gr.Warning("Error sending video to Gemini.")
+        self.video_queue.put_nowait(frame)
     async def video_emit(self) -> VideoEmitType:
         try:
             return await self.video_queue.get()
         except asyncio.CancelledError:
             logger.info("Video emit cancelled.")
+            return None
         except Exception as e:
             logger.exception(f"Error in video_emit: {e}")
             return None
             try:
                 client = genai.Client(api_key=api_key, http_options={"api_version": "v1alpha"})
                 config = {"response_modalities": ["AUDIO"]}
+                async with client.aio.live.connect(  # Use async with, like the original
                     model="gemini-2.0-flash-exp", config=config
+                ) as session:  # <--- Get session from context manager
                     self.session = session
                     gemini_connected = True
                     asyncio.create_task(self.receive_audio())
                     await self.quit.wait()
             except Exception as e:
                 logger.error(f"Error connecting to Gemini: {e}")
+                gemini_connected = False
                 self.shutdown()
                 gr.Warning(f"Failed to connect to Gemini: {e}")
+            finally:
+                update_gemini_status_sync()
     async def generator(self):
+        if not self.session:
             logger.warning("Gemini session is not initialized.")
+            return
         while not self.quit.is_set():
             try:
+                turn = self.session.receive()  # NO await here, like the original
                 async for response in turn:
                     if data := response.data:
                         yield data
             except Exception as e:
                 logger.error(f"Error receiving from Gemini: {e}")
+                # NOTE: Do not exit loop.
+                # The user may need to say something else
+                # if there is a problem.
+                # break
     async def receive_audio(self):
         try:
+            # Correctly use the async generator
+            async for audio_response in async_aggregate_bytes_to_16bit(self.generator()):
                 self.audio_queue.put_nowait(audio_response)
         except Exception as e:
             logger.exception(f"Error in receive_audio: {e}")
     async def receive(self, frame: tuple[int, np.ndarray]) -> None:
         _, array = frame
         array = array.squeeze()
                 await self.session.send(audio_message)
         except Exception as e:
             logger.error(f"Error sending audio: {e}")
+            gr.Warning("Error sending audio to Gemini.")
     async def emit(self) -> AudioEmitType:
         if not self.args_set.is_set():
             await self.wait_for_args()
         if self.session is None:
             asyncio.create_task(self.connect(self.latest_args[1]))
         try:
             array = await self.audio_queue.get()
             return (self.output_sample_rate, np.array([]))
         except Exception as e:
             logger.exception(f"Error in emit: {e}")
+            return (self.output_sample_rate, np.array([]))
     def shutdown(self) -> None:
         global gemini_connected
+        gemini_connected = False
         logger.info("Shutting down GeminiHandler.")
         self.quit.set()
         self.connection = None
         self.args_set.clear()
         if self.session:
+            # No good async close.
             pass
         self.quit.clear()
+        update_gemini_status_sync()
+def update_gemini_status_sync():
+    """Updates the Gemini status message (synchronous version)."""
+    status = "Connected" if gemini_connected else "Disconnected"
+    if 'demo' in locals() and demo.running:
+        gr.update(value=f"Gemini connection status: {status}")
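
One thing to watch in `update_gemini_status_sync` above: inside a function, `locals()` holds only that function's own names, so `'demo' in locals()` is false even when `demo` is defined at module level, and the `gr.update(...)` line is never reached. The sketch below shows the difference in plain Python; `demo` here is a hypothetical placeholder object, not a Gradio `Blocks`. Separately, `gr.update` appears to only build an update value that an event handler has to return, so calling it and discarding the result would likely not change the UI either.

```python
demo = object()  # hypothetical placeholder defined outside the functions


def sees_demo_in_locals():
    status = "Connected"  # the only local name in this function
    return "demo" in locals()


def sees_demo_by_name():
    # Ordinary name lookup falls through to the enclosing/module scope.
    try:
        demo
    except NameError:
        return False
    return True


in_locals = sees_demo_in_locals()
by_name = sees_demo_by_name()
```

Checking the name directly, or passing the status component into the function, avoids depending on `locals()`.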
 # --- Gradio UI ---
 #video-source {max-width: 600px !important; max-height: 600 !important;}
 """
+# Perform Twilio check *before* Gradio UI definition (synchronously)
+if __name__ == "__main__":
+    check_twilio_availability_sync()
+with gr.Blocks(css=css) as demo:
+    gr.HTML(
         """
+    <div style='display: flex; align-items: center; justify-content: center; gap: 20px'>
+        <div style="background-color: var(--block-background-fill); border-radius: 8px">
+            <img src="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png" style="width: 100px; height: 100px;">
+        </div>
+        <div>
+            <h1>Gen AI SDK Voice Chat</h1>
+            <p>Speak with Gemini using real-time audio + video streaming</p>
+            <p>Powered by <a href="https://gradio.app/">Gradio</a> and <a href="https://freddyaboulton.github.io/gradio-webrtc/">WebRTC</a>⚡️</p>
+            <p>Get an API Key <a href="https://support.google.com/googleapi/answer/6158862?hl=en">here</a></p>
+        </div>
+    </div>
+    """
+    )
+    twilio_status_message = gr.Markdown("")
+    gemini_status_message = gr.Markdown("")
+    with gr.Row() as api_key_row:
+        api_key = gr.Textbox(
+            label="API Key",
+            type="password",
+            placeholder="Enter your API Key",
+            value=os.getenv("GOOGLE_API_KEY"),
         )
+    with gr.Row(visible=False) as row:
+        with gr.Column():
+            # Set rtc_configuration based on the *pre-checked* twilio_available
+            rtc_config = get_twilio_turn_credentials() if twilio_available else None
+            # Explicitly specify codecs (example - you might need to adjust)
+            if rtc_config:
+                rtc_config['codecs'] = ['VP8', 'H264']  # Prefer VP8, then H.264
+            webrtc = WebRTC(
+                label="Video Chat",
+                modality="audio-video",
+                mode="send-receive",
+                elem_id="video-source",
+                rtc_configuration=rtc_config,
+                icon="https://www.gstatic.com/lamda/images/gemini_favicon_f069958c85030456e93de685481c559f160ea06b.png",
+                pulse_color="rgb(35, 157, 225)",
+                icon_button_color="rgb(35, 157, 225)",
             )
+        with gr.Column():
+            image_input = gr.Image(label="Image", type="numpy", sources=["upload", "clipboard"])
+    def update_twilio_status_ui():
+        if twilio_available:
+            message = "Twilio TURN server available."
+        else:
+            message = "**Warning:** Twilio TURN server unavailable. Connection might be less reliable."
+        return gr.update(value=message)
+    demo.load(update_twilio_status_ui, [], [twilio_status_message])
+    webrtc.stream(
+        GeminiHandler(),
+        inputs=[webrtc, api_key, image_input],
+        outputs=[webrtc],
+        time_limit=90,
+        concurrency_limit=None,
+    )
+    def check_api_key(api_key_str):
+        if not api_key_str:
+            return (
+                gr.update(visible=True),
+                gr.update(visible=False),
+                gr.update(value="Please enter a valid API key"),
+                gr.update(value=""),
             )
+        return (
+            gr.update(visible=False),
+            gr.update(visible=True),
+            gr.update(value=""),
+            gr.update(value=""),
         )
+    api_key.submit(
+        check_api_key,
+        [api_key],
+        [api_key_row, row, twilio_status_message, gemini_status_message],
+    )
+demo.launch()
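
The traceback in the commit message ends in a `KeyboardInterrupt` raised while Gradio is already closing its server, which suggests a second interrupt arrived during `self.thread.join(timeout=5)`. Below is a minimal sketch, assuming that reading is right, of wrapping a blocking call so that an interrupt is logged once and followed by cleanup instead of surfacing as a traceback. `run_until_interrupted`, `blocking_call`, and `cleanup` are hypothetical names; in `app.py` the blocking call would be `demo.launch()` and the cleanup would be whatever teardown the handler needs.

```python
import logging

logger = logging.getLogger(__name__)


def run_until_interrupted(blocking_call, cleanup):
    """Run blocking_call(); on Ctrl+C, log it, run cleanup(), return False."""
    try:
        blocking_call()
        return True
    except KeyboardInterrupt:
        logger.info("Interrupted; shutting down.")
        return False
    finally:
        try:
            cleanup()
        except KeyboardInterrupt:
            # A second Ctrl+C during cleanup: stop waiting and exit quietly.
            logger.info("Interrupted again during cleanup.")


# Demonstration with stand-ins for the launch call and the teardown.
events = []


def fake_launch():
    raise KeyboardInterrupt


def fake_cleanup():
    events.append("cleanup")


finished_normally = run_until_interrupted(fake_launch, fake_cleanup)
```

The `finally` block runs whether the call returns or is interrupted, so teardown happens exactly once on either path.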