Upload 6 files
- .gitignore +4 -0
- ReadMe.md +221 -0
- app.py +371 -0
- detection.pt +3 -0
- detection.py +215 -0
- requirements.txt +14 -0
.gitignore
ADDED
@@ -0,0 +1,4 @@
.gradio
*.mp4
*.json
*.log
ReadMe.md
ADDED
@@ -0,0 +1,221 @@
# 🎥 Video Person Detection & Tracking with ReID

A computer vision application that combines YOLOv8, InsightFace, and TorchReID for robust person detection, tracking, and re-identification in videos. The application provides a user-friendly Gradio interface for easy video processing.

## 🔧 Technology Stack

- **YOLOv8**: Real-time person detection
- **ByteTrack**: Multi-object tracking algorithm
- **InsightFace**: Facial feature extraction for person identification
- **OSNet**: Full-body re-identification features
- **Gradio**: Web-based user interface

## 📋 Features

- Real-time person detection and tracking
- Consistent person re-identification across frames
- Face and body feature extraction
- Interactive web interface
- JSON export of tracking data
- Support for multiple video formats

## 🚀 Quick Start

### Prerequisites

**System Requirements:**
- Python 3.8 or higher
- CUDA-compatible GPU (recommended for better performance)
- At least 4GB RAM
- 2GB free disk space

**Platform-Specific Dependencies:**

**Linux:**
```bash
# Install g++ compiler (required for InsightFace)
sudo apt-get update
sudo apt-get install g++ build-essential
```

**Windows:**
- Install [Microsoft Visual C++ Redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe) (latest version)
- Ensure you have Visual Studio Build Tools or Visual Studio Community installed

**macOS:**
```bash
# Install Xcode command line tools
xcode-select --install
```

### Installation

1. **Clone the repository:**
```bash
git clone git@github.com:zebshah7851/object-detection-and-tracking.git
cd object-detection-and-tracking
```

2. **Create a virtual environment:**
```bash
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/macOS:
source venv/bin/activate
```

3. **Install dependencies:**
```bash
pip install --upgrade pip
pip install -r requirements.txt
```

**Note:** The installation process may take 10-15 minutes due to large model downloads (PyTorch, CUDA libraries, etc.).

### Model Setup

The application requires several pre-trained models:

1. **YOLOv8 Detection Model:**
   - Place your trained `detection.pt` model file in the project root directory
   - Alternatively, the app will download a default YOLOv8 model on first run (see the sketch after this list)

2. **InsightFace Model:**
   - The `buffalo_l` model will be automatically downloaded on first run
   - Requires ~2GB of storage space

3. **TorchReID Model:**
   - The `osnet_x0_25` model will be automatically downloaded
   - Pre-trained on the Market1501 dataset

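Note that `app.py` currently loads `detection.pt` directly; a minimal sketch of how that fallback could be wired in (assuming the stock `yolov8n.pt` weights, which ultralytics downloads automatically on first use):

```python
import os
from ultralytics import YOLO

# Prefer the custom person-detection weights if present;
# otherwise fall back to a stock YOLOv8 model that
# ultralytics fetches automatically on first use.
weights = 'detection.pt' if os.path.exists('detection.pt') else 'yolov8n.pt'
model = YOLO(weights)
```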
### Running the Application

1. **Start the Gradio interface:**
```bash
python app.py
```

2. **Access the web interface:**
   - Open your browser and navigate to: `http://127.0.0.1:7860`
   - The interface will load automatically

3. **Process videos:**
   - Upload a video file (MP4, AVI, MOV, WEBM)
   - Click "🚀 Process Video"
   - Download the processed video and tracking data

## 📁 Project Structure

```
video-person-tracking/
├── app.py              # Gradio web interface
├── detection.py        # Core detection script
├── requirements.txt    # Python dependencies
├── ReadMe.md           # This file
├── outputs/            # Generated output files
├── detection.pt        # YOLOv8 model to detect persons
└── logs/               # Application logs
```

## 🔧 Configuration

### Model Parameters

You can adjust the following parameters in `app.py`:

```python
DETECTION_THRESHOLD = 0.75  # Person detection confidence threshold
SIMILARITY_THRESHOLD = 0.6  # Person re-identification threshold
```

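The similarity threshold is a cosine-similarity cutoff between person embeddings (in the current code the 0.6 value is applied inline in `assign_global_id`). A minimal sketch of the check it performs:

```python
import numpy as np

def is_same_person(emb_a: np.ndarray, emb_b: np.ndarray,
                   threshold: float = 0.6) -> bool:
    # Cosine similarity; the epsilon guards against division by zero.
    sim = float(np.dot(emb_a, emb_b) / (
        np.linalg.norm(emb_a) * np.linalg.norm(emb_b) + 1e-6
    ))
    return sim > threshold
```

Raising the threshold reduces ID switches between similar-looking people, at the cost of occasionally splitting one person across two IDs.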
### Performance Optimization

**For GPU acceleration:**
- Ensure CUDA is properly installed
- The application automatically detects and uses GPU if available (see the quick check below)
- Monitor GPU memory usage for large videos

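A quick way to confirm that both PyTorch and ONNX Runtime (used by InsightFace) can see the GPU, assuming the pinned packages from `requirements.txt` are installed:

```python
import torch
import onnxruntime as ort

print("PyTorch CUDA available:", torch.cuda.is_available())
print("ONNX Runtime providers:", ort.get_available_providers())
```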
**For CPU-only systems:**
- Reduce video resolution before processing (see the downscaling sketch below)
- Process shorter video segments
- Expect longer processing times

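For example, a small OpenCV sketch to downscale a clip to 720p before uploading (file names are placeholders):

```python
import cv2

cap = cv2.VideoCapture("input.mp4")  # placeholder path
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
scale = 720 / h
size = (int(w * scale) // 2 * 2, 720)  # keep width even for mp4v

out = cv2.VideoWriter("input_720p.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    out.write(cv2.resize(frame, size))
cap.release()
out.release()
```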
## 📊 Output Format

### Processed Video
- Annotated video with bounding boxes
- Consistent person IDs across frames
- Real-time tracking visualization

### JSON Tracking Data
```json
{
  "metadata": {
    "total_frames": 1500,
    "total_people": 5,
    "id_mapping": {...}
  },
  "frames": [
    {
      "frame": 1,
      "people": [
        {
          "person_id": 1,
          "center_x": 320.5,
          "center_y": 240.0,
          "confidence": 0.85,
          "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}
        }
      ]
    }
  ]
}
```

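A short example of consuming the exported tracking data, here pulling out one person's trajectory (the file name is a placeholder for your exported JSON):

```python
import json

with open("tracking_data.json") as f:  # placeholder path
    data = json.load(f)

print(f"{data['metadata']['total_people']} people across "
      f"{data['metadata']['total_frames']} frames")

# Frame-by-frame center positions of person 1
trajectory = [
    (frame["frame"], person["center_x"], person["center_y"])
    for frame in data["frames"]
    for person in frame["people"]
    if person["person_id"] == 1
]
```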
## 🐛 Troubleshooting

### Common Issues

**Installation Problems:**

1. **InsightFace installation fails:**
```bash
# Try installing with specific versions
pip install insightface==0.7.3
pip install onnxruntime-gpu==1.14.1
```

   On Linux, make sure g++ is installed; on Windows, install the latest Microsoft Visual C++ Redistributable (see Prerequisites above).

2. **Model download issues:**
   - Check internet connection
   - Manually download models if automatic download fails
   - Ensure sufficient disk space

**Runtime Issues:**

1. **Video won't load in browser:**
   - Try downloading the output video manually
   - Check browser compatibility
   - Clear browser cache

2. **Slow processing:**
   - Use GPU acceleration if available
   - Raise the detection threshold so fewer detections need embedding extraction
   - Process shorter video segments

3. **High memory usage:**
   - Monitor system resources
   - Close unnecessary applications
   - Use smaller input videos

## 📝 System Requirements

- **CPU:** Intel i5 or AMD Ryzen 5 (4 cores)
- **RAM:** 8GB
- **Storage:** 5GB free space
- **GPU:** Optional, but recommended for faster processing
app.py
ADDED
@@ -0,0 +1,371 @@
import warnings
warnings.filterwarnings("ignore")

import gradio as gr
import cv2
import numpy as np
import json
import os
from datetime import datetime
from ultralytics import YOLO
from insightface.app import FaceAnalysis
import torchreid
import torch
import logging
import shutil
import tempfile
import uuid

# ========== Logging Configuration ==========
logging.basicConfig(
    level=logging.INFO,
    format='[%(asctime)s] [%(levelname)s] %(message)s',
    handlers=[
        logging.FileHandler("app.log"),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)

# ========== Configuration ==========
DETECTION_THRESHOLD = 0.75

# Create output directory for Gradio
OUTPUT_DIR = os.path.join(os.getcwd(), "outputs")
os.makedirs(OUTPUT_DIR, exist_ok=True)

# ========== Video Processing Class ==========
class VideoProcessor:
    def __init__(self):
        try:
            self.model = YOLO('detection.pt')
            self.face_app = FaceAnalysis(name='buffalo_l', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
            self.face_app.prepare(ctx_id=0)
            self.reid_extractor = torchreid.utils.FeatureExtractor(
                model_name='osnet_x0_25',
                model_path=None,
                device='cuda' if torch.cuda.is_available() else 'cpu'
            )
            self.models_loaded = True
            logger.info("Models loaded successfully.")
        except Exception as e:
            logger.exception("Model loading failed.")
            self.models_loaded = False
        self.reset_tracking()

    def reset_tracking(self):
        self.known_embeddings = []
        self.known_ids = []
        self.next_global_id = 1
        self.track_to_global = {}
        self.tracking_data = {
            "metadata": {
                "total_frames": 0,
                "total_people": 0,
                "id_mapping": {}
            },
            "frames": []
        }
        logger.info("Tracking state reset.")

    def extract_embeddings(self, person_crop):
        face_embedding, body_embedding = None, None
        try:
            faces = self.face_app.get(person_crop)
            if faces:
                face_embedding = faces[0].embedding
        except Exception:
            logger.debug("Face embedding failed.")
        try:
            body_input = cv2.resize(person_crop, (128, 256))
            body_input = cv2.cvtColor(body_input, cv2.COLOR_BGR2RGB)
            body_embedding = self.reid_extractor(body_input)[0].cpu().numpy()
        except Exception:
            logger.debug("Body embedding failed.")

        if face_embedding is not None and body_embedding is not None:
            return np.concatenate((face_embedding, body_embedding)).astype(np.float32)
        elif face_embedding is not None:
            return face_embedding.astype(np.float32)
        elif body_embedding is not None:
            return body_embedding.astype(np.float32)
        return None

    def assign_global_id(self, embedding, track_id):
        if embedding is None:
            return self.track_to_global.get(track_id, f"T{track_id}")
        match_found = False
        if self.known_embeddings:
            matching_embeddings = [
                (emb, gid) for emb, gid in zip(self.known_embeddings, self.known_ids)
                if emb.shape[0] == embedding.shape[0]
            ]
            if matching_embeddings:
                embs, gids = zip(*matching_embeddings)
                embs = np.array(embs)
                sims = np.dot(embs, embedding) / (
                    np.linalg.norm(embs, axis=1) * np.linalg.norm(embedding) + 1e-6
                )
                best_match = np.argmax(sims)
                if sims[best_match] > 0.6:
                    global_id = gids[best_match]
                    match_found = True
        if not match_found:
            global_id = self.next_global_id
            self.next_global_id += 1
            self.known_embeddings.append(embedding)
            self.known_ids.append(global_id)
        if track_id is not None:
            self.track_to_global[track_id] = global_id
        return global_id

    def process_video(self, input_video_path, progress_callback=None):
        if not self.models_loaded:
            raise Exception("Models not loaded properly")

        self.reset_tracking()

        # Create output files with timestamp
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        unique_id = str(uuid.uuid4())[:8]

        # Use the OUTPUT_DIR instead of temp directory
        output_video_path = os.path.join(OUTPUT_DIR, f"tracked_video_{timestamp}_{unique_id}.mp4")
        output_json_path = os.path.join(OUTPUT_DIR, f"tracking_data_{timestamp}_{unique_id}.json")

        cap = cv2.VideoCapture(input_video_path)
        if not cap.isOpened():
            raise Exception("Could not open video file")

        width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = cap.get(cv2.CAP_PROP_FPS)
        total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        # Try mp4v first; fall back to other codecs if the writer fails to open
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')
        out = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))

        # Verify video writer is properly initialized
        if not out.isOpened():
            logger.warning("mp4v codec failed, trying XVID")
            fourcc = cv2.VideoWriter_fourcc(*'XVID')
            output_video_path = output_video_path.replace('.mp4', '.avi')
            out = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))

        if not out.isOpened():
            logger.warning("XVID codec failed, trying H264")
            fourcc = cv2.VideoWriter_fourcc(*'H264')
            output_video_path = output_video_path.replace('.avi', '.mp4')
            out = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))
        frame_count = 0

        while True:
            ret, frame = cap.read()
            if not ret:
                break

            frame_count += 1
            if progress_callback:
                progress_callback(frame_count / total_frames, f"Processing frame {frame_count}/{total_frames}")

            frame_data = {"frame": frame_count, "people": []}

            try:
                results = self.model.track(
                    frame, tracker="bytetrack.yaml", persist=True, verbose=False, conf=DETECTION_THRESHOLD
                )

                for result in results:
                    if result.boxes is not None:
                        boxes = result.boxes.xyxy.cpu().numpy()
                        confidences = result.boxes.conf.cpu().numpy()
                        track_ids = result.boxes.id.int().cpu().tolist() if result.boxes.id is not None else [None] * len(boxes)

                        for box, conf, track_id in zip(boxes, confidences, track_ids):
                            x1, y1, x2, y2 = map(int, box)
                            person_crop = frame[y1:y2, x1:x2]
                            if person_crop.size > 0:
                                embedding = self.extract_embeddings(person_crop)
                                global_id = self.assign_global_id(embedding, track_id)

                                frame_data["people"].append({
                                    "person_id": global_id,
                                    "center_x": (x1 + x2) / 2,
                                    "center_y": (y1 + y2) / 2,
                                    "confidence": float(conf),
                                    "bbox": {"x1": float(x1), "y1": float(y1), "x2": float(x2), "y2": float(y2)}
                                })

                                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                                cv2.putText(frame, f"ID {global_id}", (x1, y1 - 10),
                                            cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
            except Exception as e:
                logger.exception(f"Error processing frame {frame_count}.")

            self.tracking_data["frames"].append(frame_data)
            out.write(frame)

        cap.release()
        out.release()

        # Verify the output file was created and has content
        if not os.path.exists(output_video_path) or os.path.getsize(output_video_path) == 0:
            raise Exception("Output video file was not created properly")

        self.tracking_data["metadata"]["total_frames"] = frame_count
        self.tracking_data["metadata"]["total_people"] = len(set(self.known_ids))
        self.tracking_data["metadata"]["id_mapping"] = {str(k): v for k, v in self.track_to_global.items()}

        # Save JSON file
        with open(output_json_path, 'w') as f:
            json.dump(self.tracking_data, f, indent=2)

        logger.info(f"Video processing completed. Saved to {output_video_path}")
        logger.info(f"Video file size: {os.path.getsize(output_video_path)} bytes")

        return output_video_path, output_json_path

# ========== Processor ==========
processor = VideoProcessor()

# ========== Gradio Handler ==========
def process_video_gradio(input_video, progress=gr.Progress()):
    if input_video is None:
        return None, None, "Please upload a video file."

    try:
        def progress_callback(prog, message):
            progress(prog, desc=message)

        # Process video
        output_video_path, output_json_path = processor.process_video(input_video, progress_callback)

        # Verify files exist and are accessible
        if not os.path.exists(output_video_path):
            raise Exception(f"Output video not found at {output_video_path}")
        if not os.path.exists(output_json_path):
            raise Exception(f"Output JSON not found at {output_json_path}")

        # Read tracking data for stats
        with open(output_json_path, 'r') as f:
            data = json.load(f)

        stats = f"""
**Processing Complete!** ✅

- **Total Frames Processed:** {data['metadata']['total_frames']}
- **Total People Detected:** {data['metadata']['total_people']}
- **Unique IDs Assigned:** {len(data['metadata']['id_mapping'])}
- **Output Video Size:** {os.path.getsize(output_video_path) / (1024*1024):.1f} MB

📹 **Output video** is ready for download
📄 **JSON tracking data** contains frame-by-frame detection results
"""

        logger.info(f"Returning video path: {output_video_path}")
        logger.info(f"Video exists: {os.path.exists(output_video_path)}")

        return output_video_path, output_json_path, stats

    except Exception as e:
        logger.exception("Video processing failed.")
        return None, None, f"❌ **Error processing video:** {str(e)}"

# ========== Gradio Interface ==========
def create_interface():
    with gr.Blocks(title="Video Person Detection & Tracking", theme=gr.themes.Soft()) as demo:
        gr.Markdown("# 🎥 Video Person Detection & Tracking with ReID")
        gr.Markdown("Upload a video to detect and track people using YOLOv8, InsightFace, and ReID models for consistent person identification across frames.")

        with gr.Row():
            with gr.Column(scale=1):
                input_video = gr.Video(
                    label="📂 Upload Input Video",
                    height=400,
                    interactive=True
                )
                process_btn = gr.Button(
                    "🚀 Process Video",
                    variant="primary",
                    size="lg"
                )

            with gr.Column(scale=1):
                output_video = gr.Video(
                    label="🎬 Processed Video (with tracking)",
                    height=400,
                    interactive=False,
                    show_download_button=True  # Enable download button
                )
                download_json = gr.File(
                    label="📊 Download Tracking Data (JSON)",
                    interactive=False
                )

        with gr.Row():
            status_text = gr.Markdown("📤 Upload a video and click **'Process Video'** to start tracking people.")

        # Event handler
        process_btn.click(
            fn=process_video_gradio,
            inputs=[input_video],
            outputs=[output_video, download_json, status_text],
            show_progress=True
        )

        # Additional information
        with gr.Accordion("📖 How it works", open=False):
            gr.Markdown("""
            ### 🔧 **Technology Stack:**
            - **YOLOv8:** Real-time person detection
            - **ByteTrack:** Multi-object tracking algorithm
            - **InsightFace:** Facial feature extraction for person identification
            - **OSNet:** Full-body re-identification features

            ### 📋 **Process:**
            1. **Detection:** YOLOv8 detects people in each frame
            2. **Tracking:** ByteTrack assigns temporary tracking IDs
            3. **Feature Extraction:** InsightFace + OSNet extract identifying features
            4. **Re-identification:** Combines face and body features for consistent global IDs
            5. **Output:** Generates annotated video + detailed JSON tracking data

            ### 📁 **Supported Formats:**
            - **Input:** MP4, AVI, MOV, WEBM
            - **Output:** MP4 video + JSON metadata
            """)

        with gr.Accordion("⚙️ Model Configuration", open=False):
            gr.Markdown(f"""
            - **Detection Threshold:** {DETECTION_THRESHOLD}
            - **Similarity Threshold:** 0.6 (for person re-identification)
            - **Device:** {"CUDA" if torch.cuda.is_available() else "CPU"}
            - **Output Directory:** {OUTPUT_DIR}
            """)

        with gr.Accordion("🔧 Troubleshooting", open=False):
            gr.Markdown("""
            **If video doesn't display:**
            1. Check if the output file exists in the outputs directory
            2. Try downloading the video manually
            3. Ensure proper video codec support

            **Common issues:**
            - Large video files may take time to load
            - Some browsers may not support certain video formats
            - Network issues can affect video streaming
            """)

    return demo

# ========== Launch ==========
if __name__ == "__main__":
    demo = create_interface()
    # Add file serving for outputs directory
    demo.launch(
        share=False,
        server_name="127.0.0.1",
        server_port=7860,
        show_error=True
    )
detection.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:04f78656185b52201e8bb37ac0990901ccbfcb4b1455c3f514ea18adc702672c
size 40485178
detection.py
ADDED
@@ -0,0 +1,215 @@
import cv2
import numpy as np
from ultralytics import YOLO
from insightface.app import FaceAnalysis
import torchreid
import torch

# Configuration
DETECTION_THRESHOLD = 0.75  # Confidence threshold for person detection

# =============================================================================
# MODEL INITIALIZATION
# =============================================================================

# Load YOLOv8 model with ByteTrack tracker for person detection and tracking
# YOLOv8 handles object detection while ByteTrack provides consistent tracking IDs
model = YOLO(r'detection.pt')  # Replace with your trained model path

# Initialize InsightFace for facial feature extraction
# Uses buffalo_l model which provides high-quality face embeddings
face_app = FaceAnalysis(name='buffalo_l', providers=['CUDAExecutionProvider'])
face_app.prepare(ctx_id=0)  # Prepare for GPU inference

# Initialize TorchReID for full-body person re-identification
# OSNet is a lightweight but effective model for person ReID
reid_extractor = torchreid.utils.FeatureExtractor(
    model_name='osnet_x0_25',
    model_path='osnet_x0_25_market1501.pth',  # Pre-trained on Market1501 dataset
    device='cuda' if torch.cuda.is_available() else 'cpu'
)

# =============================================================================
# GLOBAL VARIABLES FOR PERSON RE-IDENTIFICATION
# =============================================================================

# Storage for known person embeddings and their assigned global IDs
known_embeddings = []  # List of combined face+body embeddings
known_ids = []         # Corresponding global IDs for each embedding
next_global_id = 1     # Counter for assigning new global IDs

# Mapping from ByteTrack tracker IDs to global person IDs
# This helps maintain consistency when tracker IDs change
track_to_global = {}

# =============================================================================
# VIDEO INPUT/OUTPUT SETUP
# =============================================================================

# Initialize video capture and output writer
cap = cv2.VideoCapture("demo.mp4")  # Input video file
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)

# Create output video writer with same properties as input
out = cv2.VideoWriter("output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))

# =============================================================================
# MAIN PROCESSING LOOP
# =============================================================================

while True:
    ret, frame = cap.read()
    if not ret:
        break  # End of video

    # Run YOLOv8 detection with ByteTrack tracking
    # persist=True maintains tracking across frames
    results = model.track(frame, tracker="bytetrack.yaml", persist=True,
                          verbose=False, conf=DETECTION_THRESHOLD)

    # Process each detection result
    for result in results:
        # Extract bounding boxes in (x1, y1, x2, y2) format
        boxes = result.boxes.xyxy.cpu().numpy()

        # Extract tracking IDs if available
        if result.boxes.id is not None:
            track_ids = result.boxes.id.int().cpu().tolist()
        else:
            # No tracking IDs available, assign None for each detection
            track_ids = [None] * len(boxes)

        # Process each detected person
        for box, track_id in zip(boxes, track_ids):
            x1, y1, x2, y2 = map(int, box)

            # Crop the person from the frame
            person_crop = frame[y1:y2, x1:x2]

            # Initialize embedding variables
            face_embedding = None
            body_embedding = None

            # =============================================================
            # FACE EMBEDDING EXTRACTION
            # =============================================================

            # Extract face embedding using InsightFace
            faces = face_app.get(person_crop)
            if faces:
                # Use the first detected face (most confident)
                face_embedding = faces[0].embedding

            # =============================================================
            # BODY EMBEDDING EXTRACTION
            # =============================================================

            # Extract body embedding using TorchReID
            try:
                # TorchReID expects 128x256 RGB input
                body_input = cv2.resize(person_crop, (128, 256))
                body_input = cv2.cvtColor(body_input, cv2.COLOR_BGR2RGB)

                # Extract features and convert to numpy
                body_embedding = reid_extractor(body_input)[0].cpu().numpy()
            except Exception:
                # Handle cases where crop is too small or invalid
                pass
            # =============================================================
            # EMBEDDING COMBINATION AND PERSON MATCHING
            # =============================================================

            # Combine face and body embeddings for robust person representation
            embedding = None
            if face_embedding is not None and body_embedding is not None:
                # Concatenate both embeddings for maximum distinctiveness
                embedding = np.concatenate((face_embedding, body_embedding)).astype(np.float32)
            elif face_embedding is not None:
                # Use only face embedding if body embedding failed
                embedding = face_embedding.astype(np.float32)
            elif body_embedding is not None:
                # Use only body embedding if face detection failed
                embedding = body_embedding.astype(np.float32)

            # Assign global ID based on embedding similarity
            if embedding is not None:
                match_found = False

                # Search for similar embeddings among known people
                if known_embeddings:
                    # Only compare embeddings of the same dimension
                    matching_embeddings = [
                        (emb, gid) for emb, gid in zip(known_embeddings, known_ids)
                        if emb.shape[0] == embedding.shape[0]
                    ]

                    if matching_embeddings:
                        embs, gids = zip(*matching_embeddings)
                        embs = np.array(embs)

                        # Calculate cosine similarity with all known embeddings
                        sims = np.dot(embs, embedding) / (
                            np.linalg.norm(embs, axis=1) * np.linalg.norm(embedding) + 1e-6
                        )

                        # Find the best match
                        best_match = np.argmax(sims)
                        if sims[best_match] > 0.6:  # Similarity threshold
                            global_id = gids[best_match]
                            match_found = True

                # If no match found, assign new global ID
                if not match_found:
                    global_id = next_global_id
                    next_global_id += 1
                    known_embeddings.append(embedding)
                    known_ids.append(global_id)

                # Update tracker ID to global ID mapping
                if track_id is not None:
                    track_to_global[track_id] = global_id

                display_id = global_id

            else:
                # No usable embedding available, fallback to tracker ID
                global_id = track_to_global.get(track_id, f"T{track_id}")
                display_id = global_id

            # =============================================================
            # VISUALIZATION
            # =============================================================

            # Draw bounding box around detected person
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

            # Display the global ID above the bounding box
            cv2.putText(frame, f"ID {display_id}", (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

    # =============================================================================
    # OUTPUT AND DISPLAY
    # =============================================================================

    # Show the frame with tracking results
    cv2.imshow("Tracking + ReID", frame)

    # Break loop if 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

    # Write frame to output video
    out.write(frame)

# =============================================================================
# CLEANUP
# =============================================================================

# Release video capture and writer resources
cap.release()
out.release()
cv2.destroyAllWindows()
requirements.txt
ADDED
@@ -0,0 +1,14 @@
--extra-index-url https://download.pytorch.org/whl/cu118

torch==2.4.1
torchvision==0.19.1
torchaudio==2.4.1
gradio==5.35.0
insightface==0.7.3
onnxruntime-gpu==1.14.1
torchreid==0.2.5
ultralytics==8.3.161
gdown==5.2.0
lap==0.5.12
tensorboard==2.19.0