| # GPT-4o Media Stream Capture and Analysis | |
| ## Project Overview | |
| This project provides a web application that captures media streams from various sources such as a webcam, desktop, or specific applications. It captures frames at intervals and uses AI to analyze and summarize the frames, providing insights using GPT-4. | |
|  | |
| ### Key Features | |
| - **Media Stream Capture**: Capture video streams from a webcam, screen, or specific applications. | |
| - **Frame Analysis**: Use OpenAI's GPT-4 to analyze captured frames for text, objects, context, and other details. | |
| - **Customizable Prompts**: Customize the prompt used for frame analysis. | |
| - **API Integration**: Integrate with OpenAI's API for frame analysis. | |
| ## Project Structure | |
| - `app.py`: The main server-side application code using Quart. | |
| - `templates/index.html`: The HTML template for the web application. | |
| - `static/script.js`: The client-side JavaScript for handling media streams and interaction with the backend. | |
| ## API Endpoints | |
| - **`GET /`**: Serves the main web application. | |
| - **`POST /process_frame`**: Processes a captured frame and returns the analysis result. | |
| ### `POST /process_frame` | |
| - **Request Body**: | |
| ```json | |
| { | |
| "image": "data:image/jpeg;base64,<base64-encoded-image>", | |
| "prompt": "Analyze this frame", | |
| "api_key": "<OpenAI API Key>" | |
| } | |
| ``` | |
| - **Response**: | |
| ```json | |
| { | |
| "response": "<Analysis result in markdown format>" | |
| } | |
| ``` | |
| ## Potential Uses | |
| - **Remote Monitoring**: Capture and analyze video streams for remote monitoring applications. | |
| - **Educational Purposes**: Use AI to analyze and summarize educational video content. | |
| - **Content Creation**: Automate the analysis and summarization of video content for creators. | |
| ## Customization | |
| - **Prompts**: Customize the analysis prompt via the settings panel in the web application. | |
| - **Refresh Rate**: Adjust the frame capture interval through the settings panel. | |
| - **API Key**: Configure the OpenAI API key via the settings panel. | |
| ## Deployment | |
| 1. **Clone the Repository**: | |
| ```bash | |
| git clone https://github.com/ruvnet/ai-video.git | |
| cd ai-video | |
| ``` | |
| 2. **Install Dependencies**: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 3. **Set Environment Variables**: | |
| ```bash | |
| export OPENAI_API_KEY=<your_openai_api_key> | |
| ``` | |
| 4. **Run the Application**: | |
| ```bash | |
| python app.py | |
| ``` | |
| 5. **Access the Application**: | |
| Open your web browser and navigate to `http://localhost:5000`. | |
| ## `requirements.txt` | |
| ```plaintext | |
| quart | |
| opencv-python-headless | |
| httpx | |
| numpy | |
| ``` | |
| ### API Endpoints | |
| - **`GET /`**: Serves the main web application. | |
| - **`POST /process_frame`**: Processes a captured frame and returns the analysis result. | |
| ### Customization | |
| - Customize prompts and refresh rates via the settings panel in the web application. | |
| - Configure the OpenAI API key via the settings panel. | |
| ## Contributing | |
| Feel free to fork the repository and submit pull requests. For major changes, please open an issue first to discuss what you would like to change. | |
| ## License | |
| [MIT](LICENSE) | |