Image Classification
ultralytics
yolo
drowsiness-detection
computer-vision

YOLO11 Model for Drowsiness Detection

This repository contains a YOLO classification model fine-tuned to detect driver drowsiness from images. The model classifies input images into two categories: Drowsy and Non Drowsy (Awake).

This model was trained with the ultralytics framework and reaches 99.80% accuracy on an unseen test set (see Evaluation), which makes it a promising basis for safety applications.

Model Details

  • Base Model: yolo11x-cls (the extra-large classification variant of Ultralytics YOLO11)
  • Fine-tuned on: A combined dataset for driver drowsiness detection.
  • Classes: Drowsy, Non Drowsy
  • Framework: PyTorch, Ultralytics

How to Get Started

You can easily use this model with the ultralytics library.

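# Install dependencies (run in a shell; prefix with "!" inside a notebook)
# pip install ultralytics huggingface_hub

from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the weights from the Hugging Face Hub.
# Assumption: the weights file in the repo is named 'best.pt'; adjust to the actual filename.
weights_path = hf_hub_download(repo_id='your-username/your-repo-name', filename='best.pt')
model = YOLO(weights_path)

# Run inference on an image
image_path = 'path/to/your/image.jpg'
results = model.predict(image_path)

# Read the top-1 prediction from the classification probabilities
probs = results[0].probs
top1_class_index = probs.top1            # index of the most likely class
top1_confidence = float(probs.top1conf)  # top1conf is a 0-d tensor
class_name = model.names[top1_class_index]

print(f"Prediction: {class_name} with confidence {top1_confidence:.4f}")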

Training Procedure

The model was fine-tuned on a large dataset of driver images. The training process involved:

  • Data Augmentation: Standard augmentations like random flips, color jitter (HSV), and scaling were applied.
  • Transfer Learning: The model was initialized from pretrained weights rather than trained from scratch (the official Ultralytics classification checkpoints are pretrained on ImageNet), enabling rapid convergence.

Key Hyperparameters

  • Image Size: 224x224
  • Batch Size: 185 (auto-tuned)
  • Optimizer: SGD with momentum
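
The settings above could be passed to the Ultralytics trainer roughly as follows. This is a minimal sketch and not the author's training script: the helper names `build_train_args` and `fine_tune`, the dataset path, the epoch count, and the specific augmentation and momentum values are assumptions made for illustration. Only the image size, the SGD optimizer, and the use of an auto-tuned batch size come from this card; `batch=-1` is one way Ultralytics can choose the batch size automatically, and it may not be how 185 was obtained.

```python
def build_train_args(data_dir: str, epochs: int = 20) -> dict:
    """Collect keyword arguments for ultralytics `YOLO.train` (illustrative values)."""
    return {
        "data": data_dir,    # folder with train/ and val/ subfolders, one subfolder per class
        "epochs": epochs,    # hypothetical: the card does not state the epoch count
        "imgsz": 224,        # image size from the card
        "batch": -1,         # assumption: let Ultralytics AutoBatch pick the batch size
        "optimizer": "SGD",  # optimizer from the card
        "momentum": 0.937,   # assumption: the card only says "with momentum"
        "fliplr": 0.5,       # random horizontal flip probability (assumed value)
        "hsv_h": 0.015,      # HSV color jitter strengths (assumed values)
        "hsv_s": 0.7,
        "hsv_v": 0.4,
        "scale": 0.5,        # random scaling strength (assumed value)
    }


def fine_tune(data_dir: str, epochs: int = 20):
    """Fine-tune the pretrained yolo11x-cls checkpoint on a classification dataset."""
    from ultralytics import YOLO  # imported lazily so the config helper works without it

    model = YOLO("yolo11x-cls.pt")  # pretrained classification weights
    return model.train(**build_train_args(data_dir, epochs))
```

Calling `fine_tune('path/to/dataset')` would start training under these assumptions; the dataset path is a placeholder.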

Training Results

Evaluation

The model was evaluated on a completely unseen test set to ensure a fair assessment of its generalization capabilities.

Key Performance Metrics

| Metric   | Value  | Description                                             |
|----------|--------|---------------------------------------------------------|
| Accuracy | 99.80% | Overall correctness on the test set.                    |
| APCER    | 0.00%  | Rate of 'Drowsy' drivers missed (False Negatives).      |
| BPCER    | 0.41%  | Rate of 'Non Drowsy' drivers flagged (False Positives). |
| ACER     | 0.21%  | Average of APCER and BPCER.                             |

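APCER (Attack Presentation Classification Error Rate) is the most critical safety metric, because it measures how often a drowsy driver goes undetected. The APCER, BPCER (Bona Fide Presentation Classification Error Rate), and ACER (Average Classification Error Rate) names are borrowed from biometric presentation attack detection; here 'Drowsy' takes the role of the attack class and 'Non Drowsy' the bona fide class.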
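
To make the metric definitions concrete, the sketch below computes them from confusion-matrix counts. The function name `error_rates` and the counts in the example call are made up for illustration; they are not the test-set counts behind the reported values. For the reported figures, (0.00% + 0.41%) / 2 = 0.205%, which agrees with the listed ACER of 0.21% up to rounding.

```python
def error_rates(drowsy_total: int, drowsy_missed: int,
                awake_total: int, awake_flagged: int) -> dict:
    """Compute the card's metrics from per-class counts.

    drowsy_missed: 'Drowsy' images predicted as 'Non Drowsy' (false negatives)
    awake_flagged: 'Non Drowsy' images predicted as 'Drowsy' (false positives)
    """
    apcer = drowsy_missed / drowsy_total   # miss rate on the 'Drowsy' class
    bpcer = awake_flagged / awake_total    # false-alarm rate on the 'Non Drowsy' class
    acer = (apcer + bpcer) / 2             # average of the two error rates
    correct = (drowsy_total - drowsy_missed) + (awake_total - awake_flagged)
    accuracy = correct / (drowsy_total + awake_total)
    return {"Accuracy": accuracy, "APCER": apcer, "BPCER": bpcer, "ACER": acer}


# Hypothetical counts, chosen only to illustrate the arithmetic
rates = error_rates(drowsy_total=1000, drowsy_missed=5,
                    awake_total=1000, awake_flagged=10)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```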

Confusion Matrix

Model Explainability (Grad-CAM)

To ensure the model is focusing on relevant facial features, Grad-CAM was used. The heatmaps confirm that the model's predictions are primarily based on the eye and mouth regions, as expected.
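
For readers unfamiliar with the technique, the core Grad-CAM computation can be sketched in a few lines. This is a simplified illustration in plain Python, not the code used for this model: the function name `grad_cam` and the toy inputs are hypothetical. In practice the activations and gradients would be captured from one of the network's last convolutional layers (for example with framework hooks) and handled as tensors; that capture step is not shown.

```python
def grad_cam(activations, gradients):
    """Return a normalized Grad-CAM heatmap.

    activations: K feature maps (each an H x W nested list) from one conv layer
    gradients:   gradients of the target class score w.r.t. those feature maps
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight = spatial average of that channel's gradient
    weights = [sum(map(sum, grad)) / (height * width) for grad in gradients]

    # Weighted sum of feature maps, followed by ReLU to keep positive evidence only
    cam = [
        [max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
         for j in range(width)]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be overlaid on the image
    peak = max(map(max, cam))
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: 2 channels of size 2 x 2 (made-up numbers)
toy_activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(toy_activations, toy_gradients)
print(heatmap)
```

Regions with high values in the resulting map are those that most increased the score of the target class, which is how the eye and mouth regions show up in the heatmaps.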

[Figure: Grad-CAM heatmaps]

Intended Use and Limitations

This model is intended as a proof-of-concept for driver safety systems. It should not be used as the sole mechanism for preventing accidents. Real-world performance may vary based on lighting conditions, camera angles, occlusions (e.g., sunglasses), and individual differences.

This model card is based on the training notebook yolov11_drowsiness.ipynb.
