ParsaKhaz committed
Commit aa305e2 · verified · 1 Parent(s): 2db60a3

Upload folder using huggingface_hub

.github/workflows/update_space.yml CHANGED
@@ -1,28 +1,28 @@
name: Run Python script

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'

      - name: Install Gradio
        run: python -m pip install gradio

      - name: Log in to Hugging Face
        run: python -c 'import huggingface_hub; huggingface_hub.login(token="${{ secrets.hf_token }}")'

      - name: Deploy to Spaces
        run: gradio deploy

.gitignore CHANGED
@@ -1,52 +1,52 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
*.dll

# Virtual Environment
venv/
env/
ENV/
.venv/

# IDE
.idea/
.vscode/
*.swp
*.swo

# Project specific
inputs/*
outputs/*
!inputs/.gitkeep
!outputs/.gitkeep
inputs/
outputs/

# Model files
*.pth
*.onnx
*.pt

# Logs
*.log

certificate.pem

README.md CHANGED
@@ -1,165 +1,165 @@
---
title: promptable-content-moderation
app_file: app.py
sdk: gradio
sdk_version: 5.16.1
---
# Promptable Content Moderation with Moondream

Welcome to the future of content moderation with Moondream 2B, a powerful and lightweight vision-language model that enables detection and moderation of video content using natural language prompts.

[Try it now.](https://huggingface.co/spaces/moondream/content-moderation)

## Features

- Content moderation through natural language prompts
- Multiple visualization styles
- Intelligent scene detection and tracking:
  - DeepSORT tracking with scene-aware reset
  - Persistent moderation across frames
  - Smart tracker reset at scene boundaries
- Optional grid-based detection for improved accuracy on complex scenes
- Frame-by-frame processing with IoU-based merging
- Web-compatible output format
- Test mode (process only the first X seconds)
- Advanced moderation analysis with multiple visualization plots

## Examples

- | Example Outputs |
- |------|
- | ![Demo](./examples/clip-cig.gif) |
- | ![Demo](./examples/clip-gu.gif) |
- | ![Demo](./examples/clip-conflag.gif) |
+ | Prompt | Example Output |
+ |--------|----------------|
+ | "white cigarette" | ![Demo](./examples/clip-cig.gif) |
+ | "gun" | ![Demo](./examples/clip-gu.gif) |
+ | "confederate flag" | ![Demo](./examples/clip-conflag.gif) |

## Requirements

### Python Dependencies

For Windows users, before installing the other requirements, first install PyTorch with CUDA support:

```bash
pip install torch==2.5.1+cu121 torchvision==0.20.1+cu121 --index-url https://download.pytorch.org/whl/cu121
```

Then install the remaining dependencies:

```bash
pip install -r requirements.txt
```

### System Requirements

- FFmpeg (required for video processing)
- libvips (required for image processing)

Installation by platform:

- Ubuntu/Debian: `sudo apt-get install ffmpeg libvips`
- macOS: `brew install ffmpeg libvips`
- Windows:
  - Download FFmpeg from [ffmpeg.org](https://ffmpeg.org/download.html)
  - Follow the [libvips Windows installation guide](https://docs.moondream.ai/quick-start)

## Installation

1. Clone the repository, change into this recipe's directory, and create a new virtual environment:

   ```bash
   git clone https://github.com/vikhyat/moondream
   cd moondream/recipes/promptable-video-redaction
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   ```

2. Install Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Install FFmpeg and libvips:
   - On Ubuntu/Debian: `sudo apt-get install ffmpeg libvips`
   - On macOS: `brew install ffmpeg libvips`
   - On Windows: Download FFmpeg from [ffmpeg.org](https://ffmpeg.org/download.html)

   > Installing libvips on Windows requires some additional steps; see the [quick-start guide](https://docs.moondream.ai/quick-start).

## Usage

The easiest way to use this tool is through its web interface, which provides a user-friendly experience for video content moderation.

### Web Interface

1. Start the web interface:

   ```bash
   python app.py
   ```

2. Open the provided URL in your browser (typically <http://localhost:7860>).

3. Use the interface to:
   - Upload your video file
   - Specify the content to moderate (e.g., "face", "cigarette", "gun")
   - Choose a redaction style (default: obfuscated-pixel)
   - OPTIONAL: Configure advanced settings
     - Processing speed/quality
     - Grid size for detection
     - Test mode for quick validation (default: on, 3 seconds)
   - Process the video and download the results
   - Analyze detection patterns with the visualization tools

## Output Files

The tool generates two types of output files in the `outputs` directory:

1. Processed Videos:
   - Format: `[style]_[content_type]_[original_filename].mp4`
   - Example: `censor_inappropriate_video.mp4`

2. Detection Data:
   - Format: `[style]_[content_type]_[original_filename]_detections.json`
   - Contains frame-by-frame detection information
   - Used for visualization and analysis (see the example below)

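For reference, here is a rough sketch of what a `*_detections.json` file contains, written as a Python dict. The field names are the ones read by `visualization.py` and `video_visualization.py` below; the exact schema and all values shown here are illustrative assumptions, not a guaranteed format.

```python
# Illustrative shape of a *_detections.json file (values are made up).
detections = {
    "video_metadata": {
        "fps": 30.0,
        "total_frames": 90,
        "duration_sec": 3.0,
        "detect_keyword": "cigarette",
    },
    "frame_detections": [
        {
            "frame": 0,
            "timestamp": 0.0,
            # bbox is a normalized [x1, y1, x2, y2] box
            "objects": [{"keyword": "cigarette", "bbox": [0.10, 0.20, 0.35, 0.45]}],
        },
    ],
}
```
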
## Technical Details

### Scene Detection and Tracking

The tool uses advanced scene detection and object tracking:

1. Scene Detection:
   - Powered by PySceneDetect's ContentDetector
   - Automatically identifies scene changes in videos
   - Configurable detection threshold (default: 30.0)
   - Helps maintain tracking accuracy across scene boundaries (a standalone sketch follows this list)

2. Object Tracking:
   - DeepSORT tracking for consistent object identification
   - Automatic tracker reset at scene changes
   - Maintains object identity within scenes
   - Prevents tracking errors across scene boundaries

3. Integration Benefits:
   - More accurate object tracking
   - Better handling of scene transitions
   - Reduced false positives in tracking
   - Improved tracking consistency

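As a standalone sketch of the scene-detection step (not the exact wiring used by this app), PySceneDetect's ContentDetector can be run on a clip like this; the 30.0 threshold matches the default mentioned above, and the input path is a placeholder:

```python
from scenedetect import ContentDetector, detect

# Detect scene boundaries; the tracker would then be reset at each boundary.
scenes = detect("inputs/clip.mp4", ContentDetector(threshold=30.0))
for start, end in scenes:
    print(f"Scene: frames {start.get_frames()}-{end.get_frames()}")
```
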
## Best Practices

- Use test mode for initial configuration
- Enable grid-based detection for complex scenes
- Choose an appropriate redaction style based on the content type:
  - Censor: complete content blocking
  - Blur styles: less intrusive moderation
  - Bounding Box: content review and analysis
- Monitor system resources during processing
- Use appropriate processing quality settings based on your needs

## Notes

- Processing time depends on video length, resolution, GPU availability, and the chosen settings
- A GPU is strongly recommended for faster processing
- Grid-based detection increases accuracy but requires more processing time, since each grid cell is processed independently (a sketch of the idea follows this list)
- Test mode processes only the first X seconds (default: 3 seconds) for quick validation
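
A minimal sketch of the grid idea referenced above, with a placeholder `detect_fn` standing in for the actual model call; the function name, grid size, and cell-to-frame mapping are assumptions for illustration, not this repo's API:

```python
import numpy as np

def detect_on_grid(frame: np.ndarray, detect_fn, rows: int = 2, cols: int = 2):
    """Run detect_fn on each grid cell; detect_fn returns normalized [x1, y1, x2, y2] boxes."""
    h, w = frame.shape[:2]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            for bx1, by1, bx2, by2 in detect_fn(frame[y0:y1, x0:x1]):
                # Map cell-normalized coordinates back to full-frame normalized coordinates.
                boxes.append([
                    (x0 + bx1 * (x1 - x0)) / w,
                    (y0 + by1 * (y1 - y0)) / h,
                    (x0 + bx2 * (x1 - x0)) / w,
                    (y0 + by2 * (y1 - y0)) / h,
                ])
    return boxes
```
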
deep_sort_integration.py CHANGED
@@ -1,73 +1,73 @@
import numpy as np
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort
from datetime import datetime

class DeepSORTTracker:
    def __init__(self, max_age=5):
        """Initialize DeepSORT tracker."""
        self.max_age = max_age
        self.tracker = self._create_tracker()

    def _create_tracker(self):
        """Create a new instance of DeepSort tracker."""
        return DeepSort(
            max_age=self.max_age,
            embedder='mobilenet',  # Using default MobileNetV2 embedder
            today=datetime.now().date()  # For track naming and daily ID reset
        )

    def reset(self):
        """Reset the tracker state by creating a new instance."""
        print("Resetting DeepSORT tracker...")
        self.tracker = self._create_tracker()

    def update(self, frame, detections):
        """Update tracking with new detections.

        Args:
            frame: Current video frame (numpy array)
            detections: List of (box, keyword) tuples where box is [x1, y1, x2, y2] normalized

        Returns:
            List of (box, keyword, track_id) tuples
        """
        if not detections:
            return []

        height, width = frame.shape[:2]

        # Convert normalized coordinates to absolute and format detections
        detection_list = []
        for box, keyword in detections:
            x1 = int(box[0] * width)
            y1 = int(box[1] * height)
            x2 = int(box[2] * width)
            y2 = int(box[3] * height)
            w = x2 - x1
            h = y2 - y1

            # Format: ([left, top, w, h], confidence, detection_class)
            detection_list.append(([x1, y1, w, h], 1.0, keyword))

        # Update tracker
        tracks = self.tracker.update_tracks(detection_list, frame=frame)

        # Convert back to normalized coordinates with track IDs
        tracked_objects = []
        for track in tracks:
            if not track.is_confirmed():
                continue

            ltrb = track.to_ltrb()  # Get [left, top, right, bottom] format
            x1, y1, x2, y2 = ltrb

            # Normalize coordinates
            x1 = max(0.0, min(1.0, x1 / width))
            y1 = max(0.0, min(1.0, y1 / height))
            x2 = max(0.0, min(1.0, x2 / width))
            y2 = max(0.0, min(1.0, y2 / height))

            tracked_objects.append(([x1, y1, x2, y2], track.det_class, track.track_id))

        return tracked_objects
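
A hedged usage sketch for the class above: it assumes BGR frames from OpenCV and detector output in the normalized `(box, keyword)` format documented in `update`; the video path and the hard-coded detection are placeholders.

```python
import cv2

from deep_sort_integration import DeepSORTTracker

tracker = DeepSORTTracker(max_age=5)
cap = cv2.VideoCapture("inputs/clip.mp4")  # placeholder path
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # In the real pipeline these boxes would come from the Moondream detector.
    detections = [([0.1, 0.2, 0.3, 0.4], "cigarette")]
    for box, keyword, track_id in tracker.update(frame, detections):
        print(track_id, keyword, box)
cap.release()
```
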
packages.txt CHANGED
@@ -1,2 +1,2 @@
libvips
ffmpeg

persistence.py CHANGED
@@ -1,39 +1,39 @@
import json
import os

def save_detection_data(data, output_file):
    """
    Saves the detection data to a JSON file.

    Args:
        data (dict): The complete detection data structure.
        output_file (str): Path to the output JSON file.
    """
    try:
        # Create directory if it doesn't exist
        os.makedirs(os.path.dirname(output_file), exist_ok=True)

        with open(output_file, "w") as f:
            json.dump(data, f, indent=4)
        print(f"Detection data saved to {output_file}")
        return True
    except Exception as e:
        print(f"Error saving data: {str(e)}")
        return False

def load_detection_data(input_file):
    """
    Loads the detection data from a JSON file.

    Args:
        input_file (str): Path to the JSON file.

    Returns:
        dict: The loaded detection data, or None if there was an error.
    """
    try:
        with open(input_file, "r") as f:
            return json.load(f)
    except Exception as e:
        print(f"Error loading data: {str(e)}")
        return None
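
A short round-trip sketch for these helpers; the dict mirrors the field names used elsewhere in the repo, but its exact contents and the output path are illustrative:

```python
from persistence import save_detection_data, load_detection_data

data = {
    "video_metadata": {"fps": 30.0, "total_frames": 90, "duration_sec": 3.0, "detect_keyword": "gun"},
    "frame_detections": [
        {"frame": 0, "timestamp": 0.0, "objects": [{"keyword": "gun", "bbox": [0.1, 0.2, 0.3, 0.4]}]},
    ],
}

# Write the JSON file, then read it back (path is a placeholder).
if save_detection_data(data, "outputs/example_detections.json"):
    print(load_detection_data("outputs/example_detections.json")["video_metadata"])
```
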
requirements.txt CHANGED
@@ -1,26 +1,26 @@
gradio>=4.0.0
torch>=2.0.0
# if on windows: pip install torch==2.5.1+cu121 torchvision==0.20.1+cu121 --index-url https://download.pytorch.org/whl/cu121
transformers>=4.36.0
opencv-python>=4.8.0
pillow>=10.0.0
numpy>=1.24.0
tqdm>=4.66.0
ffmpeg-python
einops
pyvips-binary
pyvips
accelerate
# for spaces
--extra-index-url https://download.pytorch.org/whl/cu113
spaces
# SAM dependencies
torchvision>=0.20.1
matplotlib>=3.7.0
pandas>=2.0.0
plotly
# DeepSORT dependencies
deep-sort-realtime>=1.3.2
scikit-learn  # Required for deep-sort-realtime
# Scene detection dependencies (for intelligent scene-aware tracking)
scenedetect[opencv]>=0.6.2  # Provides scene change detection capabilities

video_visualization.py CHANGED
@@ -1,330 +1,330 @@
import os
import tempfile
import subprocess
import matplotlib.pyplot as plt
import pandas as pd
import cv2
import numpy as np
from tqdm import tqdm
from persistence import load_detection_data

def create_frame_data(json_path):
    """Create frame-by-frame detection data for visualization."""
    try:
        data = load_detection_data(json_path)
        if not data:
            print("No data loaded from JSON file")
            return None

        if "video_metadata" not in data or "frame_detections" not in data:
            print("Invalid JSON structure: missing required fields")
            return None

        # Extract video metadata
        metadata = data["video_metadata"]
        if "fps" not in metadata or "total_frames" not in metadata:
            print("Invalid metadata: missing fps or total_frames")
            return None

        fps = metadata["fps"]
        total_frames = metadata["total_frames"]

        # Create frame data
        frame_counts = {}
        for frame_data in data["frame_detections"]:
            if "frame" not in frame_data or "objects" not in frame_data:
                continue  # Skip invalid frame data
            frame_num = frame_data["frame"]
            frame_counts[frame_num] = len(frame_data["objects"])

        # Fill in missing frames with 0 detections
        for frame in range(total_frames):
            if frame not in frame_counts:
                frame_counts[frame] = 0

        if not frame_counts:
            print("No valid frame data found")
            return None

        # Convert to DataFrame
        df = pd.DataFrame(list(frame_counts.items()), columns=["frame", "detections"])
        df["timestamp"] = df["frame"] / fps

        return df, metadata

    except Exception as e:
        print(f"Error creating frame data: {str(e)}")
        import traceback
        traceback.print_exc()
        return None

def generate_frame_image(df, frame_num, temp_dir, max_y):
    """Generate and save a single frame of the visualization."""
    # Set the style to dark background
    plt.style.use('dark_background')

    # Set global font to monospace
    plt.rcParams['font.family'] = 'monospace'
    plt.rcParams['font.monospace'] = ['DejaVu Sans Mono']

    plt.figure(figsize=(10, 6))

    # Plot data up to current frame
    current_data = df[df['frame'] <= frame_num]
    plt.plot(df['frame'], df['detections'], color='#1a1a1a', alpha=0.5)  # Darker background line
    plt.plot(current_data['frame'], current_data['detections'], color='#00ff41')  # Matrix green

    # Add vertical line for current position
    plt.axvline(x=frame_num, color='#ff0000', linestyle='-', alpha=0.7)  # Keep red for position

    # Set consistent axes
    plt.xlim(0, len(df) - 1)
    plt.ylim(0, max_y * 1.1)  # Add 10% padding

    # Add labels with Matrix green color
    plt.title(f'FRAME {frame_num:04d} - DETECTIONS OVER TIME', color='#00ff41', pad=20)
    plt.xlabel('FRAME NUMBER', color='#00ff41')
    plt.ylabel('NUMBER OF DETECTIONS', color='#00ff41')

    # Add current stats in Matrix green with monospace formatting
    current_detections = df[df['frame'] == frame_num]['detections'].iloc[0]
    plt.text(0.02, 0.98, f'CURRENT DETECTIONS: {current_detections:02d}',
             transform=plt.gca().transAxes, verticalalignment='top',
             color='#00ff41', family='monospace')

    # Style the grid and ticks
    plt.grid(True, color='#1a1a1a', linestyle='-', alpha=0.3)
    plt.tick_params(colors='#00ff41')

    # Save frame
    frame_path = os.path.join(temp_dir, f'frame_{frame_num:05d}.png')
    plt.savefig(frame_path, bbox_inches='tight', dpi=100, facecolor='black', edgecolor='none')
    plt.close()

    return frame_path

def generate_gauge_frame(df, frame_num, temp_dir, detect_keyword="OBJECT"):
    """Generate a modern square-style binary gauge visualization frame."""
    # Set the style to dark background
    plt.style.use('dark_background')

    # Set global font to monospace
    plt.rcParams['font.family'] = 'monospace'
    plt.rcParams['font.monospace'] = ['DejaVu Sans Mono']

    # Create figure with 16:9 aspect ratio
    plt.figure(figsize=(16, 9))

    # Get current detection state
    current_detections = df[df['frame'] == frame_num]['detections'].iloc[0]
    has_detection = current_detections > 0

    # Create a simple gauge visualization
    plt.axis('off')

    # Set colors
    if has_detection:
        color = '#00ff41'  # Matrix green for YES
        status = 'YES'
        indicator_pos = 0.8  # Right position
    else:
        color = '#ff0000'  # Red for NO
        status = 'NO'
        indicator_pos = 0.2  # Left position

    # Draw background rectangle
    background = plt.Rectangle((0.1, 0.3), 0.8, 0.2,
                               facecolor='#1a1a1a',
                               edgecolor='#333333',
                               linewidth=2)
    plt.gca().add_patch(background)

    # Draw indicator
    indicator_width = 0.05
    indicator = plt.Rectangle((indicator_pos - indicator_width/2, 0.25),
                              indicator_width, 0.3,
                              facecolor=color,
                              edgecolor=None)
    plt.gca().add_patch(indicator)

    # Add tick marks
    tick_positions = [0.2, 0.5, 0.8]  # NO, CENTER, YES
    for x in tick_positions:
        plt.plot([x, x], [0.3, 0.5], color='#444444', linewidth=2)

    # Add YES/NO labels
    plt.text(0.8, 0.2, 'YES', color='#00ff41', fontsize=14,
             ha='center', va='center', family='monospace')
    plt.text(0.2, 0.2, 'NO', color='#ff0000', fontsize=14,
             ha='center', va='center', family='monospace')

    # Add status box at top with detection keyword
    plt.text(0.5, 0.8, f'{detect_keyword.upper()} DETECTED?', color=color,
             fontsize=16, ha='center', va='center', family='monospace',
             bbox=dict(facecolor='#1a1a1a',
                       edgecolor=color,
                       linewidth=2,
                       pad=10))

    # Add frame counter at bottom
    plt.text(0.5, 0.1, f'FRAME: {frame_num:04d}', color='#00ff41',
             fontsize=14, ha='center', va='center', family='monospace')

    # Add subtle grid lines for depth
    for x in np.linspace(0.2, 0.8, 7):
        plt.plot([x, x], [0.3, 0.5], color='#222222', linewidth=1, zorder=0)

    # Add glow effect to indicator
    for i in range(3):
        glow = plt.Rectangle((indicator_pos - (indicator_width + i*0.01)/2,
                              0.25 - i*0.01),
                             indicator_width + i*0.01,
                             0.3 + i*0.02,
                             facecolor=color,
                             alpha=0.1/(i+1))
        plt.gca().add_patch(glow)

    # Set consistent plot limits
    plt.xlim(0, 1)
    plt.ylim(0, 1)

    # Save frame with 16:9 aspect ratio
    frame_path = os.path.join(temp_dir, f'gauge_{frame_num:05d}.png')
    plt.savefig(frame_path,
                bbox_inches='tight',
                dpi=100,
                facecolor='black',
                edgecolor='none',
                pad_inches=0)
    plt.close()

    return frame_path

def create_video_visualization(json_path, style="timeline"):
    """Create a video visualization of the detection data."""
    try:
        if not json_path:
            return None, "No JSON file provided"

        if not os.path.exists(json_path):
            return None, f"File not found: {json_path}"

        # Load and process data
        result = create_frame_data(json_path)
        if result is None:
            return None, "Failed to load detection data from JSON file"

        frame_data, metadata = result
        if len(frame_data) == 0:
            return None, "No frame data found in JSON file"

        total_frames = metadata["total_frames"]
        detect_keyword = metadata.get("detect_keyword", "OBJECT")  # Get the detection keyword

        # Create temporary directory for frames
        with tempfile.TemporaryDirectory() as temp_dir:
            max_y = frame_data['detections'].max()

            # Generate each frame
            print("Generating frames...")
            frame_paths = []
            with tqdm(total=total_frames, desc="Generating frames") as pbar:
                for frame in range(total_frames):
                    try:
                        if style == "gauge":
                            frame_path = generate_gauge_frame(frame_data, frame, temp_dir, detect_keyword)
                        else:  # default to timeline
                            frame_path = generate_frame_image(frame_data, frame, temp_dir, max_y)
                        if frame_path and os.path.exists(frame_path):
                            frame_paths.append(frame_path)
                        else:
                            print(f"Warning: Failed to generate frame {frame}")
                        pbar.update(1)
                    except Exception as e:
                        print(f"Error generating frame {frame}: {str(e)}")
                        continue

            if not frame_paths:
                return None, "Failed to generate any frames"

            # Create output video path
            output_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "outputs")
            os.makedirs(output_dir, exist_ok=True)
            output_video = os.path.join(output_dir, f"detection_visualization_{style}.mp4")

            # Create temp output path
            base, ext = os.path.splitext(output_video)
            temp_output = f"{base}_temp{ext}"

            # First pass: Create video with OpenCV VideoWriter
            print("Creating initial video...")
            # Get frame size from first image
            first_frame = cv2.imread(frame_paths[0])
            height, width = first_frame.shape[:2]

            out = cv2.VideoWriter(
                temp_output,
                cv2.VideoWriter_fourcc(*"mp4v"),
                metadata["fps"],
                (width, height)
            )

            with tqdm(total=total_frames, desc="Creating video") as pbar:  # Use total_frames here too
                for frame_path in frame_paths:
                    frame = cv2.imread(frame_path)
                    out.write(frame)
                    pbar.update(1)

            out.release()

            # Second pass: Convert to web-compatible format
            print("Converting to web format...")
            try:
                subprocess.run(
                    [
                        "ffmpeg",
                        "-y",
                        "-i",
                        temp_output,
                        "-c:v",
                        "libx264",
                        "-preset",
                        "medium",
                        "-crf",
                        "23",
                        "-movflags",
                        "+faststart",  # Better web playback
                        "-loglevel",
                        "error",
                        output_video,
                    ],
                    check=True,
                )

                os.remove(temp_output)  # Remove the temporary file

                if not os.path.exists(output_video):
                    print(f"Warning: FFmpeg completed but output file not found at {output_video}")
                    return None, "Failed to create video"

                # Return video path and stats
                stats = f"""Video Stats:
FPS: {metadata['fps']}
Total Frames: {metadata['total_frames']}
Duration: {metadata['duration_sec']:.2f} seconds
Max Detections in a Frame: {frame_data['detections'].max()}
Average Detections per Frame: {frame_data['detections'].mean():.2f}"""

                return output_video, stats

            except subprocess.CalledProcessError as e:
                print(f"Error running FFmpeg: {str(e)}")
                if os.path.exists(temp_output):
                    os.remove(temp_output)
                return None, f"Error creating visualization: {str(e)}"

    except Exception as e:
        print(f"Error creating video visualization: {str(e)}")
        import traceback
        traceback.print_exc()
        return None, f"Error creating visualization: {str(e)}"
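
A minimal sketch of calling the function above directly; `style` accepts "timeline" (the default) or "gauge", and the detections path is a placeholder following the naming convention described in the README:

```python
from video_visualization import create_video_visualization

video_path, stats = create_video_visualization(
    "outputs/censor_gun_clip_detections.json",  # placeholder detections file
    style="gauge",
)
if video_path:
    print(f"Visualization written to {video_path}")
print(stats)
```
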
visualization.py CHANGED
@@ -1,98 +1,98 @@
import pandas as pd
import matplotlib.pyplot as plt
from persistence import load_detection_data
import argparse

def visualize_detections(json_path):
    """
    Visualize detection data from a JSON file.

    Args:
        json_path (str): Path to the JSON file containing detection data.
    """
    # Load the persisted JSON data
    data = load_detection_data(json_path)
    if not data:
        return

    # Convert the frame detections to a DataFrame
    rows = []
    for frame_data in data["frame_detections"]:
        frame = frame_data["frame"]
        timestamp = frame_data["timestamp"]
        for obj in frame_data["objects"]:
            rows.append({
                "frame": frame,
                "timestamp": timestamp,
                "keyword": obj["keyword"],
                "x1": obj["bbox"][0],
                "y1": obj["bbox"][1],
                "x2": obj["bbox"][2],
                "y2": obj["bbox"][3],
                "area": (obj["bbox"][2] - obj["bbox"][0]) * (obj["bbox"][3] - obj["bbox"][1])
            })

    if not rows:
        print("No detections found in the data")
        return

    df = pd.DataFrame(rows)

    # Create a figure with multiple subplots
    fig = plt.figure(figsize=(15, 10))

    # Plot 1: Number of detections per frame
    plt.subplot(2, 2, 1)
    detections_per_frame = df.groupby("frame").size()
    plt.plot(detections_per_frame.index, detections_per_frame.values)
    plt.xlabel("Frame")
    plt.ylabel("Number of Detections")
    plt.title("Detections Per Frame")

    # Plot 2: Distribution of detection areas
    plt.subplot(2, 2, 2)
    df["area"].hist(bins=30)
    plt.xlabel("Detection Area (normalized)")
    plt.ylabel("Count")
    plt.title("Distribution of Detection Areas")

    # Plot 3: Average detection area over time
    plt.subplot(2, 2, 3)
    avg_area = df.groupby("frame")["area"].mean()
    plt.plot(avg_area.index, avg_area.values)
    plt.xlabel("Frame")
    plt.ylabel("Average Detection Area")
    plt.title("Average Detection Area Over Time")

    # Plot 4: Heatmap of detection centers
    plt.subplot(2, 2, 4)
    df["center_x"] = (df["x1"] + df["x2"]) / 2
    df["center_y"] = (df["y1"] + df["y2"]) / 2
    plt.hist2d(df["center_x"], df["center_y"], bins=30)
    plt.colorbar()
    plt.xlabel("X Position")
    plt.ylabel("Y Position")
    plt.title("Detection Center Heatmap")

    # Adjust layout and display
    plt.tight_layout()
    plt.show()

    # Print summary statistics
    print("\nSummary Statistics:")
    print(f"Total frames analyzed: {len(data['frame_detections'])}")
    print(f"Total detections: {len(df)}")
    print(f"Average detections per frame: {len(df) / len(data['frame_detections']):.2f}")
    print(f"\nVideo metadata:")
    for key, value in data["video_metadata"].items():
        print(f"{key}: {value}")

def main():
    parser = argparse.ArgumentParser(description="Visualize object detection data")
    parser.add_argument("json_file", help="Path to the JSON file containing detection data")
    args = parser.parse_args()

    visualize_detections(args.json_file)

if __name__ == "__main__":
    main()
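
The script is meant to be run from the command line (e.g. `python visualization.py outputs/censor_gun_clip_detections.json`, with a placeholder path), but the plotting function can also be called directly:

```python
from visualization import visualize_detections

# Placeholder path; any *_detections.json produced by the app should work.
visualize_detections("outputs/censor_gun_clip_detections.json")
```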