CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Stashface is a Python-based face recognition application that identifies performers in images using ensemble machine learning models. It provides a Gradio web interface for uploading images and searching for performer matches against a database of known performers.
Common Commands
Installation and Setup
uv sync # Install dependencies with uv (or: uv pip install -r requirements.txt if the project has no pyproject.toml)
Running the Application
python app.py # Launch the Gradio web interface on localhost:7860
Testing
pytest tests/ # Run all tests
pytest tests/test_vtt_parser.py # Run specific test file
Environment Variables
- DEEPFACE_HOME: Set to "." (current directory) for DeepFace model storage
- CUDA_VISIBLE_DEVICES: Set to "-1" to force CPU usage
- VISAGE_KEY: Required for decrypting the performer database in persons.zip
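A minimal sketch of how these variables might be applied from Python before the ML libraries are imported. The helper name `configure_environment` is hypothetical and is not taken from the repository; only the variable names and values come from the list above.

```python
import os


def configure_environment(env=os.environ):
    """Apply the defaults described above and check that VISAGE_KEY is present.

    Hypothetical helper: call it before importing DeepFace or TensorFlow,
    because CUDA_VISIBLE_DEVICES must be set before the framework initializes.
    """
    env.setdefault("DEEPFACE_HOME", ".")          # keep DeepFace models in the current directory
    env.setdefault("CUDA_VISIBLE_DEVICES", "-1")  # hide all GPUs to force CPU inference
    if not env.get("VISAGE_KEY"):
        raise RuntimeError("VISAGE_KEY is required to decrypt persons.zip")
    return env
```

Using `setdefault` keeps any value already exported in the shell, so the sketch only fills in what is missing.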
Architecture
Core Components
DataManager (models/data_manager.py): Handles loading and querying face recognition data
- Manages encrypted performer database (data/persons.zip)
- Loads face embeddings from JSON (data/faces.json)
- Handles Voyager vector indices for FaceNet and ArcFace models
EnsembleFaceRecognition (models/face_recognition.py): Implements ensemble face recognition
- Combines FaceNet512 and ArcFace models using weighted voting
- Normalizes distances and computes confidence scores
- Uses DeepFace backend for face detection and embedding extraction
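The combination step can be sketched roughly as follows. This is an illustration only: the min-max normalization, the confidence formula (1 minus the normalized distance), and the function names are assumptions, not the code in models/face_recognition.py. Only the two model names and the idea of weighted voting come from the description above.

```python
from collections import defaultdict


def normalize_distances(distances):
    """Min-max scale distances to [0, 1]; all zeros if every distance is equal."""
    if not distances:
        return []
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [0.0 for _ in distances]
    return [(d - lo) / (hi - lo) for d in distances]


def ensemble_vote(model_results, weights):
    """Combine per-model candidate lists into one ranked list.

    model_results maps a model name to a list of (performer_id, distance).
    weights maps a model name to its voting weight.
    Returns (performer_id, confidence) pairs, best match first.
    """
    scores = defaultdict(float)
    total_weight = sum(weights[name] for name in model_results)
    for name, candidates in model_results.items():
        normalized = normalize_distances([d for _, d in candidates])
        for (performer_id, _), norm in zip(candidates, normalized):
            # A smaller distance means a closer match, so invert it into a score.
            scores[performer_id] += weights[name] * (1.0 - norm)
    ranked = [(pid, score / total_weight) for pid, score in scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Toy distances for three hypothetical performers "a", "b", "c".
results = {
    "facenet512": [("a", 0.2), ("b", 0.6), ("c", 1.0)],
    "arcface": [("a", 0.3), ("c", 0.5), ("b", 0.7)],
}
ranking = ensemble_vote(results, {"facenet512": 1.0, "arcface": 1.0})
```

In this toy input, performer "a" is the closest match under both models and therefore ranks first.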
WebInterface (web/interface.py): Gradio-based web interface
- Two main tabs: Multiple Face Search and Faces in Sprite
- Handles image uploads and displays JSON results
- Integrates with image processing pipeline
Image Processing (models/image_processor.py): Core image processing logic
- Extracts faces using YOLOv8 and MediaPipe detectors
- Generates embeddings for original and horizontally flipped images
- Returns performer information with confidence scores
Data Flow
- User uploads image through Gradio interface
- Face detection extracts individual faces from image
- Face embeddings generated using ensemble models (FaceNet + ArcFace)
- Embeddings queried against Voyager vector indices
- Results ranked by confidence and returned with performer metadata
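The query and ranking steps at the end of this flow can be illustrated with a brute-force search standing in for the Voyager index. The cosine metric, the dict-based index, and the names `cosine_distance` and `query_index` are assumptions for illustration; the real application queries Voyager indices rather than scanning every entry.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def query_index(index, embedding, k=3):
    """Return the k nearest (performer_id, distance) pairs, closest first.

    `index` is a plain dict of performer_id -> embedding (a stand-in for a
    vector index, which would use approximate search instead of a full scan).
    """
    scored = [(pid, cosine_distance(embedding, vec)) for pid, vec in index.items()]
    return sorted(scored, key=lambda item: item[1])[:k]


# Toy 2-D embeddings; real face embeddings have hundreds of dimensions.
toy_index = {
    "performer_1": [1.0, 0.0],
    "performer_2": [0.0, 1.0],
    "performer_3": [0.7, 0.7],
}
matches = query_index(toy_index, [0.9, 0.1], k=2)
```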
Key Dependencies
- DeepFace: Face recognition and embedding extraction
- Gradio: Web interface framework
- Voyager: Vector similarity search indices
- MediaPipe: Alternative face detection backend
- PyZipper: Encrypted ZIP file handling for performer database
- UV: Modern Python package manager
File Structure
stashface/
├── app.py # Main application entry point
├── data/ # Face recognition data files
│   ├── faces.json # Face metadata
│   ├── persons.zip # Encrypted performer database
│   └── *.voy # Voyager vector indices
├── models/ # Core ML models and data handling
│   ├── data_manager.py # Data loading and querying
│   ├── face_recognition.py # Ensemble face recognition
│   └── image_processor.py # Image processing pipeline
├── web/ # Web interface
│   └── interface.py # Gradio interface
├── utils/ # Utility functions
│   └── vtt_parser.py # VTT file parsing for video sprites
└── tests/ # Test files
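For orientation, sprite VTT files typically pair each time range with a region of a sprite sheet through a `#xywh=x,y,w,h` fragment. A minimal parsing sketch under that assumption follows; `parse_sprite_vtt` and its return shape are hypothetical and may differ from what utils/vtt_parser.py actually does.

```python
import re

# Assumes hh:mm:ss.mmm timestamps; shorter mm:ss.mmm forms are not handled here.
CUE_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2}\.\d{3})")
XYWH = re.compile(r"#xywh=(\d+),(\d+),(\d+),(\d+)")


def parse_sprite_vtt(text):
    """Return a list of (start, end, (x, y, w, h)) tuples, one per cue."""
    cues = []
    timing = None
    for line in text.splitlines():
        line = line.strip()
        match = CUE_TIMING.match(line)
        if match:
            timing = match.groups()  # remember the timing until its payload line
            continue
        region = XYWH.search(line)
        if region and timing:
            box = tuple(int(v) for v in region.groups())
            cues.append((timing[0], timing[1], box))
            timing = None
    return cues


sample = """WEBVTT

00:00:00.000 --> 00:00:05.000
sprite.jpg#xywh=0,0,160,90

00:00:05.000 --> 00:00:10.000
sprite.jpg#xywh=160,0,160,90
"""
cues = parse_sprite_vtt(sample)
```

Each returned box could then be used to crop one thumbnail out of the sprite sheet before face detection.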
Development Notes
- The application uses CPU-only inference (CUDA is disabled via the CUDA_VISIBLE_DEVICES environment variable)
- Face embeddings are averaged between original and horizontally flipped images for better accuracy
- The performer database is encrypted and requires the VISAGE_KEY environment variable
- Vector indices use E4M3 storage format for memory efficiency
- The ensemble approach combines FaceNet512 and ArcFace models with equal weighting (1.0 each)
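The flip-averaging note can be illustrated with a small sketch. Here the image is a nested list of pixel rows and `embed` is a caller-supplied stand-in for the real embedding model; both are assumptions for illustration, and the function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def flip_averaged_embedding(image, embed):
    """Average the embeddings of an image and its horizontal mirror."""
    original = embed(image)
    mirrored = embed(flip_horizontal(image))
    return [(a + b) / 2.0 for a, b in zip(original, mirrored)]


def toy_embed(image):
    # Stand-in for a real model: the first and last column sums form a 2-D "embedding".
    return [float(sum(row[0] for row in image)), float(sum(row[-1] for row in image))]


image = [[1, 2, 3], [4, 5, 6]]
embedding = flip_averaged_embedding(image, toy_embed)
```

Because the toy embedding is not symmetric under mirroring, averaging the two views yields a vector that no longer depends on which way the face points, which is the intuition behind the technique.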