---
title: MCP Hackathon Deepfake Watchdog
emoji: 🛡️
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
license: mit
short_description: Upload your image and/or voice to scan for deepfake misuse o
---
# 🛡️ Deepfake Watchdog 🤖

🏷️ Tags: `mcp-server-track` `agent-demo-track`

A multimodal AI agent that detects deepfake content using **face**, **voice**, and **video** verification. Built with 🧠 DeepFace, 🗣️ SpeechBrain, and 🎥 OpenCV, and accelerated with ☁️ Modal for serverless, GPU-powered execution. Designed for the **Hugging Face MCP Hackathon**.


---

## 🚀 Features

- 🧑‍🦱 **Face Verification**: Compare two faces and detect if they belong to the same person using DeepFace.
- 🎙️ **Voice Verification**: Determine if two audio samples are from the same speaker with SpeechBrain.
- 🎞️ **Video Scan for Deepfakes**: Scan a video (or YouTube link) and match all faces against a reference image.
- 📄 **PDF Report**: Generate a downloadable report with all scan results.
- 🧠 **Runs as an MCP Agent**: Compatible with [Gradio MCP](https://huggingface.co/docs/mcp), fully pluggable into agentic workflows.
- ⚡ **Modal Offloading**: GPU-backed execution for DeepFace & SpeechBrain using [Modal](https://modal.com).
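
Face and voice verification both commonly come down to comparing two embedding vectors against a threshold. The sketch below illustrates that idea in plain Python. The function names and the `0.6` threshold are illustrative assumptions, not values taken from DeepFace, SpeechBrain, or this app.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_identity(
    emb_a: Sequence[float], emb_b: Sequence[float], threshold: float = 0.6
) -> Dict[str, object]:
    """Hypothetical helper: accept the pair when similarity reaches the threshold."""
    score = cosine_similarity(emb_a, emb_b)
    return {"verified": score >= threshold, "score": score}
```

In practice the embeddings would come from a face or speaker model, and the threshold would need tuning per model.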

---

## 🧱 Architecture

    Gradio UI ──> MCP Agent ──> Modal Functions
                                  ├─> verify_faces_remote()
                                  ├─> verify_voices_remote()
                                  └─> verify_faces_in_video_remote()
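
One way to picture the routing in this diagram is a small dispatch table that maps a task name to one of the three functions. This is a sketch only: the stand-in function bodies, their argument lists, and the `dispatch` helper are assumptions made for illustration, and the real functions run on Modal.

```python
from typing import Callable, Dict


# Local stand-ins for the Modal functions named in the diagram.
# Their signatures here are assumed, not taken from the project.
def verify_faces_remote(reference: str, candidate: str) -> dict:
    return {"task": "faces", "inputs": [reference, candidate]}


def verify_voices_remote(reference: str, candidate: str) -> dict:
    return {"task": "voices", "inputs": [reference, candidate]}


def verify_faces_in_video_remote(reference: str, video: str) -> dict:
    return {"task": "video", "inputs": [reference, video]}


ROUTES: Dict[str, Callable[..., dict]] = {
    "face": verify_faces_remote,
    "voice": verify_voices_remote,
    "video": verify_faces_in_video_remote,
}


def dispatch(task: str, *inputs: str) -> dict:
    """Look up the handler for `task` and call it with the given inputs."""
    try:
        handler = ROUTES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return handler(*inputs)
```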


---

## 🧩 Tech Stack

| Tool         | Purpose                          |
|--------------|----------------------------------|
| `Gradio`     | Frontend UI + MCP integration    |
| `Modal`      | GPU offloading for ML workloads  |
| `DeepFace`   | Face verification                |
| `SpeechBrain`| Voice verification               |
| `OpenCV`     | Video processing                 |
| `Pillow`     | Image manipulation               |
| `TensorFlow` | Required by DeepFace             |
| `FFmpeg`     | (Optional) for YouTube downloads |

---

## 🛠️ Setup Instructions

### 1. Clone the repo

```bash
git clone https://huggingface.co/spaces/abetavarez/MCP-Hackathon-Deepfake-Watchdog
cd MCP-Hackathon-Deepfake-Watchdog
```

### 2. Create .env
```bash
cp .env.example .env
# Add your API keys or configs if required
```

### 3. Install dependencies (locally)
```bash
pip install -r requirements.txt
```

### 4. Run the app (with MCP)
```bash
modal run app.py
```

Or, for local development:
```bash
python app.py
```

## ⚙️ Modal Setup
The app uses a single `modal_app/backend.py` file that defines:

- `verify_faces_remote`
- `verify_voices_remote`
- `verify_faces_in_video_remote`

Each function is GPU-accelerated and runs independently on demand.

```bash
modal deploy modal_app/backend.py
```
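
Because the repository also ships local detectors under `detector/`, one possible pattern is to try the remote function first and fall back to the local one if the remote call fails. The sketch below shows that pattern in plain Python. Whether the app actually does this is an assumption, and `run_with_fallback` is a hypothetical helper.

```python
from typing import Any, Callable, Dict


def run_with_fallback(
    remote_fn: Callable[..., Any], local_fn: Callable[..., Any], *args: Any, **kwargs: Any
) -> Dict[str, Any]:
    """Call `remote_fn`; if it raises, call `local_fn` with the same arguments."""
    try:
        return {"backend": "remote", "result": remote_fn(*args, **kwargs)}
    except Exception as exc:  # broad on purpose: any remote failure triggers the fallback
        return {
            "backend": "local",
            "result": local_fn(*args, **kwargs),
            "remote_error": str(exc),
        }
```

With Modal, `remote_fn` would typically be something like the `.remote` method of a deployed function, and `local_fn` the matching function from `detector/`.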

## 🧠 Agent MCP Integration
This app runs as an MCP server using:

```python
demo.launch(mcp_server=True)
```
You can use this tool in your LlamaIndex, Hugging Face or LangChain agent as a remote service.
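
To connect an MCP client, you usually point it at the server's MCP endpoint. The sketch below builds a client configuration in the `mcpServers` JSON shape that many MCP clients accept. The `/gradio_api/mcp/sse` path follows the Gradio 5 MCP documentation as best understood here, so verify it against your Gradio version. The base URL and the server name `deepfake-watchdog` are placeholders.

```python
import json


def mcp_client_config(base_url: str, server_name: str = "deepfake-watchdog") -> str:
    """Return a JSON config string pointing an MCP client at a Gradio MCP server."""
    # Assumed endpoint path; check the Gradio docs for the version you run.
    endpoint = base_url.rstrip("/") + "/gradio_api/mcp/sse"
    return json.dumps({"mcpServers": {server_name: {"url": endpoint}}}, indent=2)


# Example with a locally running app on Gradio's default port.
print(mcp_client_config("http://localhost:7860"))
```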

## 🏆 Hackathon Tracks

- ✅ **Track 1 (MCP Tool)**: Turned into a reusable tool with `mcp_server=True`
- ✅ **Track 3 (Agentic Demo)**: A fully capable AI agent for deepfake detection (LlamaIndex)
- ✅ **Modal**: Offloaded heavy tasks to Modal's GPU cloud

## 🧱 Folder Structure

    .
    ├── app.py                # Main app entry point (Gradio + MCP)
    ├── llama_agent.py        # LlamaIndex Agent implementation
    ├── tools.py              # Tools for LlamaIndex Agent and MCP
    ├── modal_app/
    │   └── backend.py        # Modal remote functions
    ├── detector/
    │   ├── face.py           # Local face verification logic
    │   ├── voice.py          # Local voice verification logic
    │   └── video.py          # Local video scan logic
    ├── reports/
    │   └── pdf_report.py     # PDF report generation
    ├── utils/
    │   └── youtube_utils.py  # YouTube download helper
    ├── README.md
    ├── requirements.txt
    └── .env.example


## 🧑‍💻 Author

Abraham Efren Tavarez

- GitHub: [@AbeTavarez](https://github.com/AbeTavarez)
- LinkedIn: [@abrahametavarez](https://www.linkedin.com/in/abrahametavarez/)

## ✨ Future Improvements

- Add S3-compatible cloud storage or video ingest
- Add Nebius S3 support for video uploads
- Add RAG-based evidence summarizer
- Enable auto-scan via background agent
- Add QR code for mobile voice capture