🎥 Video Person Detection & Tracking with ReID

A computer vision application that combines YOLOv8, ByteTrack, InsightFace, and TorchReID for person detection, tracking, and re-identification in videos. A Gradio web interface handles video upload, processing, and download of the results.

🔧 Technology Stack

YOLOv8: Real-time person detection
ByteTrack: Multi-object tracking algorithm
InsightFace: Facial feature extraction for person identification
OSNet: Full-body re-identification features
Gradio: Web-based user interface

📋 Features

Real-time person detection and tracking
Consistent person re-identification across frames
Face and body feature extraction
Interactive web interface
JSON export of tracking data
Support for multiple video formats

🚀 Quick Start

Prerequisites

System Requirements:

Python 3.8 or higher
CUDA-compatible GPU (recommended for better performance)
At least 4GB RAM
2GB free disk space
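
The software side of these requirements can be checked before installing anything. The following is a minimal sketch using only the standard library; the function name `check_environment` is hypothetical, and finding `nvidia-smi` on the PATH is only a rough hint that an NVIDIA driver is present, not a guarantee that PyTorch will be able to use the GPU.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version from the requirements list


def check_environment():
    """Return a dict describing whether basic prerequisites look satisfied."""
    return {
        # Tuples compare element by element, e.g. (3, 10) >= (3, 8)
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        # Heuristic only: the NVIDIA driver utility is reachable on PATH
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```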

Platform-Specific Dependencies:

Linux:

# Install g++ compiler (required for InsightFace)
sudo apt-get update
sudo apt-get install g++ build-essential

Windows:

Install Microsoft Visual C++ Redistributable (latest version)
Ensure you have Visual Studio Build Tools or Visual Studio Community installed

macOS:

# Install Xcode command line tools
xcode-select --install

Installation

Clone the repository:

git clone [email protected]:zebshah7851/object-detection-and-tracking.git
cd object-detection-and-tracking

Create a virtual environment:

python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/macOS:
source venv/bin/activate
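
To confirm that activation worked before installing dependencies, the interpreter itself can be asked whether it is running inside a virtual environment. This is a small sketch, not part of the project; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside an environment made by `python -m venv`."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```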

Install dependencies:

pip install --upgrade pip
pip install -r requirements.txt

Note: Installation may take 10-15 minutes because of large package downloads (PyTorch, CUDA libraries, etc.).

Model Setup

The application requires several pre-trained models:

YOLOv8 Detection Model:
- Place your trained detection.pt model file in the project root directory
- Alternatively, the app will download a default YOLOv8 model on first run
InsightFace Model:
- The buffalo_l model will be automatically downloaded on first run
- Requires ~2GB of storage space
TorchReID Model:
- The osnet_x0_25 model will be automatically downloaded
- Pre-trained on Market1501 dataset
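
The choice between a custom `detection.pt` and a default model could look roughly like the sketch below. It is an illustration, not the code in `app.py`: the helper name `resolve_detection_weights` is hypothetical, and `yolov8n.pt` is an assumed fallback, since this README does not say which default model the app downloads.

```python
from pathlib import Path

# Assumption: the README does not name the default model; yolov8n.pt is a guess.
DEFAULT_WEIGHTS = "yolov8n.pt"


def resolve_detection_weights(project_root="."):
    """Prefer a custom detection.pt in the project root, else the fallback name."""
    custom = Path(project_root) / "detection.pt"
    if custom.is_file():
        return str(custom)
    return DEFAULT_WEIGHTS


if __name__ == "__main__":
    print("detection weights:", resolve_detection_weights())
```

The returned string would then be passed to whatever loads the detector.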

Running the Application

Start the Gradio interface:

python app.py

Access the web interface:
- Open your browser and navigate to: http://127.0.0.1:7860
- The interface will load automatically
Process videos:
- Upload a video file (MP4, AVI, MOV, WEBM)
- Click "🚀 Process Video"
- Download the processed video and tracking data
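
When scripting around the app, it can help to filter files by the supported formats before uploading. The sketch below checks only the file extension against the list above; it does not inspect the container or codec, and `is_supported_video` is a hypothetical helper rather than something the app is known to expose.

```python
from pathlib import Path

# Formats listed in the upload step
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".webm"}


def is_supported_video(filename):
    """Check the file extension, case-insensitively, against the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["clip.MP4", "notes.txt"]:
        print(name, is_supported_video(name))
```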

📁 Project Structure

video-person-tracking/
├── app.py                 # Gradio web interface
├── detection.py           # Core detection script
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── outputs/               # Generated output files
├── detection.pt           # YOLOv8 model to detect persons
└── logs/                  # Application logs

🔧 Configuration

Model Parameters

You can adjust the following parameters in app.py:

DETECTION_THRESHOLD = 0.75  # Person detection confidence threshold
SIMILARITY_THRESHOLD = 0.6  # Person re-identification threshold
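
To make the effect of the two thresholds concrete, here is a simplified sketch of how such values are typically applied. It assumes that re-identification compares feature vectors by cosine similarity, which this README does not confirm; the function names and the `gallery` structure (person ID mapped to a stored embedding) are hypothetical.

```python
import math

DETECTION_THRESHOLD = 0.75   # values as documented for app.py
SIMILARITY_THRESHOLD = 0.6


def keep_confident(detections, threshold=DETECTION_THRESHOLD):
    """Drop detections whose confidence is below the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_identity(embedding, gallery, threshold=SIMILARITY_THRESHOLD):
    """Return the most similar known ID, or None if no score reaches the threshold.

    `gallery` maps person_id -> stored embedding (hypothetical structure).
    """
    best_id, best_score = None, -1.0
    for person_id, known in gallery.items():
        score = cosine_similarity(embedding, known)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id if best_score >= threshold else None
```

Under these assumptions, raising `DETECTION_THRESHOLD` discards more low-confidence boxes, and raising `SIMILARITY_THRESHOLD` makes the matcher more likely to assign a new ID than to reuse an existing one.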

Performance Optimization

For GPU acceleration:

Ensure CUDA is properly installed
The application automatically detects and uses GPU if available
Monitor GPU memory usage for large videos
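
Automatic GPU detection of this kind is commonly done through PyTorch. The sketch below shows one way to do it and is not necessarily how `app.py` implements it; `pick_device` is a hypothetical name, and the fallback to CPU when PyTorch is missing is an added convenience for running the snippet on its own.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # expected to be installed via requirements.txt
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("running on:", pick_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build or the CUDA driver is the first thing to check.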

For CPU-only systems:

Reduce video resolution before processing
Process shorter video segments
Expect longer processing times
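
Splitting a long video into shorter segments comes down to computing frame ranges. The helper below is a sketch with hypothetical names and a hypothetical default of 30 seconds per segment. Note that if segments are processed independently, person IDs may not stay consistent from one segment to the next.

```python
def frame_segments(total_frames, fps, segment_seconds=30):
    """Split [0, total_frames) into half-open (start, end) frame ranges."""
    if total_frames < 0 or fps <= 0 or segment_seconds <= 0:
        raise ValueError(
            "total_frames must be >= 0; fps and segment_seconds must be > 0"
        )
    # Frames per segment, at least one so the range always advances
    step = max(1, int(round(fps * segment_seconds)))
    return [
        (start, min(start + step, total_frames))
        for start in range(0, total_frames, step)
    ]


if __name__ == "__main__":
    # A 1500-frame video at 25 fps yields two 30-second segments
    print(frame_segments(1500, 25))  # [(0, 750), (750, 1500)]
```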

📊 Output Format

Processed Video

Annotated video with bounding boxes
Consistent person IDs across frames
Real-time tracking visualization

JSON Tracking Data

{
  "metadata": {
    "total_frames": 1500,
    "total_people": 5,
    "id_mapping": {...}
  },
  "frames": [
    {
      "frame": 1,
      "people": [
        {
          "person_id": 1,
          "center_x": 320.5,
          "center_y": 240.0,
          "confidence": 0.85,
          "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}
        }
      ]
    }
  ]
}
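
The exported JSON can be post-processed with the standard library alone. The sketch below groups the per-frame entries by `person_id` to recover each person's path; it relies only on the fields shown above, uses a small inline sample instead of a real export, and `summarize_tracks` is a hypothetical helper name.

```python
import json
from collections import defaultdict


def summarize_tracks(data):
    """Map each person_id to its frame count and list of (frame, x, y) centers."""
    tracks = defaultdict(list)
    for frame in data["frames"]:
        for person in frame["people"]:
            tracks[person["person_id"]].append(
                (frame["frame"], person["center_x"], person["center_y"])
            )
    return {
        pid: {"frames_seen": len(points), "path": points}
        for pid, points in tracks.items()
    }


# Inline sample in the documented shape; id_mapping is left empty here.
sample = json.loads("""
{
  "metadata": {"total_frames": 2, "total_people": 1, "id_mapping": {}},
  "frames": [
    {"frame": 1, "people": [{"person_id": 1, "center_x": 320.5, "center_y": 240.0,
      "confidence": 0.85, "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}}]},
    {"frame": 2, "people": [{"person_id": 1, "center_x": 322.0, "center_y": 241.5,
      "confidence": 0.88, "bbox": {"x1": 102, "y1": 51, "x2": 202, "y2": 301}}]}
  ]
}
""")

summary = summarize_tracks(sample)
print(summary[1]["frames_seen"])  # 2
```

For a real export, replace the inline sample with `json.load()` on the downloaded tracking file.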

🐛 Troubleshooting

Common Issues

Installation Problems:

InsightFace installation fails:
```
# Try installing with specific version
pip install insightface==0.7.3
pip install onnxruntime-gpu==1.14.1
```
If you are running Linux, you need to install g++. If you are running Windows, you need to install the latest Microsoft Visual C++ Redistributable.
Model download issues:
- Check internet connection
- Manually download models if automatic download fails
- Ensure sufficient disk space
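
Free disk space can be checked programmatically before triggering a model download. This is a minimal sketch: `has_free_space` is a hypothetical helper, and the 2 GB default mirrors the storage figure given for the InsightFace model in Model Setup.

```python
import shutil


def has_free_space(path=".", required_gb=2.0):
    """True when the filesystem holding `path` has at least `required_gb` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3


if __name__ == "__main__":
    print("enough space for model downloads:", has_free_space())
```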

Runtime Issues:

Video won't load in browser:
- Try downloading the output video manually
- Check browser compatibility
- Clear browser cache
Slow processing:
- Use GPU acceleration if available
- Raise the detection threshold so that fewer low-confidence detections reach tracking and re-identification
- Process shorter video segments
High memory usage:
- Monitor system resources
- Close unnecessary applications
- Use smaller input videos
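
When shrinking input videos, the target resolution should preserve the aspect ratio. The sketch below computes such a size; the 640-pixel limit is an arbitrary illustrative default, not a value taken from the application, and `downscaled_size` is a hypothetical name. The actual resizing would be done with a video tool of your choice.

```python
def downscaled_size(width, height, max_side=640):
    """Scale (width, height) so the longer side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough
    # Integer arithmetic avoids floating-point rounding surprises
    new_w = width * max_side // longest
    new_h = height * max_side // longest
    # Round down to even numbers; many video codecs expect even dimensions
    return max(2, new_w // 2 * 2), max(2, new_h // 2 * 2)


if __name__ == "__main__":
    print(downscaled_size(1920, 1080))  # (640, 360)
```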

📝 Recommended System Requirements

CPU: Intel i5 or AMD Ryzen 5 (4 cores)
RAM: 8GB
Storage: 5GB free space
GPU: Optional, but recommended for faster processing