---
title: MCP Hackathon Deepfake Watchdog
emoji: 🛡️
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
license: mit
short_description: Upload your image and/or voice to scan for deepfake misuse o
---
# 🛡️ Deepfake Watchdog 🤖
🏷️ Tags: `mcp-server-track` `agent-demo-track`
[Video Link](https://www.dropbox.com/scl/fi/bibsndi4tazikhozreg7m/2025-06-09-23-39-26.mkv?rlkey=mhq1mgbr1wuhr1k48hv49i1ga&st=7hbov0m4&dl=0)
A multimodal AI agent that detects deepfake content using **face**, **voice**, and **video** verification. Built with 🧠 DeepFace, 🗣️ SpeechBrain, and 🎥 OpenCV, and accelerated with ☁️ Modal for serverless GPU-powered execution. Designed for the **Hugging Face MCP Hackathon**.
---
## 🚀 Features
- 🧑‍🦱 **Face Verification**: Compare two faces and detect whether they belong to the same person using DeepFace.
- 🎙️ **Voice Verification**: Determine whether two audio samples come from the same speaker with SpeechBrain.
- 🎞️ **Video Scan for Deepfakes**: Scan a video (or YouTube link) and match all faces against a reference image.
- 📄 **PDF Report**: Generate a downloadable report with all scan results.
- 🧠 **Runs as an MCP Agent**: Compatible with [Gradio MCP](https://huggingface.co/docs/mcp), fully pluggable into agentic workflows.
- ⚡ **Modal Offloading**: GPU-backed execution for DeepFace & SpeechBrain using [Modal](https://modal.com).
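Face and voice verification of this kind generally comes down to comparing two embedding vectors and thresholding the result. The sketch below illustrates that idea with the standard library only. It is not the DeepFace or SpeechBrain implementation: `cosine_similarity`, `is_same_identity`, and the `0.7` threshold are hypothetical names and values chosen for illustration, and real models ship their own tuned thresholds.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_identity(
    a: Sequence[float], b: Sequence[float], threshold: float = 0.7
) -> bool:
    """Treat two embeddings as the same identity when they are similar enough.

    The default threshold is an illustrative assumption, not a tuned value.
    """
    return cosine_similarity(a, b) >= threshold
```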
---
## 🧱 Architecture
Gradio UI ──> MCP Agent ──> Modal Functions
└─> verify_faces_remote()
└─> verify_voices_remote()
└─> verify_faces_in_video_remote()
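One way to picture the flow above is a small dispatcher that routes a task name to a remote function. This is a sketch under stated assumptions, not the app's actual routing code: `dispatch`, the `remote` and `local` registries, and the fall-back-to-local behavior are all hypothetical and only meant to show how a UI or agent layer might choose a backend.

```python
from typing import Any, Callable, Dict

# A handler takes keyword arguments and returns a result dictionary.
Handler = Callable[..., Dict[str, Any]]


def dispatch(
    task: str,
    remote: Dict[str, Handler],
    local: Dict[str, Handler],
    **kwargs: Any,
) -> Dict[str, Any]:
    """Run `task` on the remote backend, falling back to a local handler.

    The fallback is an assumption made for this sketch.
    """
    remote_fn = remote.get(task)
    local_fn = local.get(task)
    if remote_fn is None and local_fn is None:
        raise KeyError(f"unknown task: {task}")
    if remote_fn is not None:
        try:
            return {"backend": "remote", **remote_fn(**kwargs)}
        except Exception:
            # Remote call failed; re-raise unless a local handler exists.
            if local_fn is None:
                raise
    return {"backend": "local", **local_fn(**kwargs)}
```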
---
## 🧩 Tech Stack
| Tool | Purpose |
|--------------|----------------------------------|
| `Gradio` | Frontend UI + MCP integration |
| `Modal` | GPU offloading for ML workloads |
| `DeepFace` | Face verification |
| `SpeechBrain`| Voice verification |
| `OpenCV` | Video processing |
| `Pillow` | Image manipulation |
| `TensorFlow` | Required by DeepFace |
| `FFmpeg` | (Optional) for YouTube downloads |
---
## 🛠️ Setup Instructions
### 1. Clone the repo
```bash
git clone https://huggingface.co/spaces/abetavarez/MCP-Hackathon-Deepfake-Watchdog
cd MCP-Hackathon-Deepfake-Watchdog
```
### 2. Create .env
```bash
cp .env.example .env
# Add your API keys or configs if required
```
### 3. Install dependencies (locally)
```bash
pip install -r requirements.txt
```
### 4. Run the app (with MCP)
```bash
modal run app.py
```
or, for development:
```bash
python app.py
```
## ⚙️ Modal Setup
The app uses a single `modal_app/backend.py` file that defines:
- `verify_faces_remote`
- `verify_voices_remote`
- `verify_faces_in_video_remote`

Each function is GPU-accelerated and runs independently on demand. Deploy them with:
```bash
modal deploy modal_app/backend.py
```
## 🧠 Agent MCP Integration
This app runs as an MCP server using:
```python
demo.launch(mcp_server=True)
```
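With `mcp_server=True`, Gradio exposes the app's functions as MCP tools and uses each function's type hints and docstring to describe the tool to an agent. The sketch below shows the shape such a function might take. `describe_match` and `build_demo` are hypothetical names that do not come from this repo, and `gradio` is imported lazily so the plain function can be used without it.

```python
from typing import Dict


def describe_match(distance: float, threshold: float) -> Dict[str, object]:
    """Summarize a verification result for an agent.

    Args:
        distance: Distance between two embeddings (lower means more similar).
        threshold: Maximum distance still counted as a match.

    Returns:
        A dictionary with the verdict and the inputs used.
    """
    return {
        "verified": distance <= threshold,
        "distance": distance,
        "threshold": threshold,
    }


def build_demo():
    """Wrap the function in a Gradio Interface (hypothetical helper)."""
    import gradio as gr  # assumes `pip install "gradio[mcp]"`

    return gr.Interface(
        fn=describe_match,
        inputs=[gr.Number(label="distance"), gr.Number(label="threshold")],
        outputs=gr.JSON(),
    )
```

Calling `build_demo().launch(mcp_server=True)` would then serve both the web UI and the MCP tool from the same process.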
You can use this tool in your LlamaIndex, Hugging Face, or LangChain agent as a remote service.
## 🏆 Hackathon Tracks
- ✅ Track 1 (MCP Tool): Turned into a reusable tool with `mcp_server=True`
- ✅ Track 3 (Agentic Demo): A fully capable AI agent for deepfake detection (LlamaIndex)
- ✅ Modal: Offloaded heavy tasks to Modal's GPU cloud
## 🧱 Folder Structure
.
├── app.py                  # Main app entry point (Gradio + MCP)
├── llama_agent.py          # LlamaIndex Agent implementation
├── tools.py                # Tools for LlamaIndex Agent and MCP
├── modal_app/
│   └── backend.py          # Modal remote functions
├── detector/
│   ├── face.py             # Local face verification logic
│   ├── voice.py            # Local voice verification logic
│   └── video.py            # Local video scan logic
├── reports/
│   └── pdf_report.py       # PDF report generation
├── utils/
│   └── youtube_utils.py    # YouTube download helper
├── README.md
├── requirements.txt
└── .env.example
## 🧑‍💻 Author
Abraham Efren Tavarez
- GitHub: [@AbeTavarez](https://github.com/AbeTavarez)
- LinkedIn: [@abrahametavarez](https://www.linkedin.com/in/abrahametavarez/)
## ✨ Future Improvements
- Add S3-compatible cloud storage for video ingest
- Add Nebius S3 support for video uploads
- Add RAG-based evidence summarizer
- Enable auto-scan via background agent
- Add QR code for mobile voice capture