YOLOv11 Model for Drowsiness Detection
This repository contains a YOLO classification model fine-tuned to detect driver drowsiness from images. The model classifies input images into two categories: Drowsy and Non Drowsy (Awake).
This model was trained using the ultralytics framework and reaches 99.80% accuracy on an unseen test set (see Evaluation), which makes it a promising starting point for safety applications.
Model Details
- Base Model: yolo11x-cls (from the Ultralytics v8 ecosystem)
- Fine-tuned on: A combined dataset for driver drowsiness detection.
- Classes: Drowsy, Non Drowsy
- Framework: PyTorch, Ultralytics
How to Get Started
You can easily use this model with the ultralytics library.
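# Install ultralytics (the leading "!" is Jupyter/Colab syntax; omit it in a regular shell)
!pip install ultralytics
from ultralytics import YOLO
# Load the model from the Hugging Face Hub (replace the placeholder with the actual repo id).
# If your ultralytics version cannot resolve a Hub repo id, download the weights file and pass its local path instead.
model = YOLO('your-username/your-repo-name')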
# Run inference on an image
image_path = 'path/to/your/image.jpg'
results = model.predict(image_path)
# Print the top prediction
probs = results[0].probs
top1_class_index = probs.top1
top1_confidence = probs.top1conf
class_name = model.names[top1_class_index]
print(f"Prediction: {class_name} with confidence {top1_confidence:.4f}")
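If the repo id does not load directly with your installed ultralytics version, one possible workaround is to fetch the weights file with huggingface_hub and pass the local path to YOLO. The sketch below is written under assumptions and is not the author's documented procedure: the helper names load_from_hub and format_prediction are hypothetical, and the weights filename is not stated in this model card, so it has to be looked up in the repository's file list.

```python
def load_from_hub(repo_id: str, filename: str):
    """Download a weights file from the Hugging Face Hub and load it with Ultralytics.

    `filename` is NOT given in the model card; look it up in the repo's file list.
    """
    # Imported lazily so the formatting helper below works without these packages.
    from huggingface_hub import hf_hub_download
    from ultralytics import YOLO

    weights_path = hf_hub_download(repo_id=repo_id, filename=filename)
    return YOLO(weights_path)


def format_prediction(names: dict, class_index: int, confidence: float) -> str:
    """Build the same summary line that the snippet above prints."""
    return f"Prediction: {names[class_index]} with confidence {confidence:.4f}"


# Usage (placeholder repo id and filename, both hypothetical):
# model = load_from_hub("your-username/your-repo-name", "<weights-file>.pt")
# probs = model.predict("path/to/your/image.jpg")[0].probs
# print(format_prediction(model.names, probs.top1, float(probs.top1conf)))
```

Converting probs.top1conf with float() turns the zero-dimensional tensor into a plain Python number, which keeps the formatting helper independent of PyTorch.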
Training Procedure
The model was fine-tuned on a large dataset of driver images. The training process involved:
- Data Augmentation: Standard augmentations like random flips, color jitter (HSV), and scaling were applied.
- Transfer Learning: The model was initialized with weights pretrained on a large-scale dataset, enabling rapid convergence.
Key Hyperparameters
- Image Size: 224x224
- Batch Size: 185 (auto-tuned)
- Optimizer: SGD with momentum
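The training setup described above could be expressed roughly as follows with the Ultralytics training API. This is a sketch, not the contents of the original notebook: only the image size and the SGD optimizer come from this card. The augmentation values, the use of batch=-1 to request automatic batch sizing, the function name fine_tune, and the dataset path are assumptions made for illustration.

```python
# Arguments for Ultralytics' train(). Only `imgsz` and `optimizer` are taken from
# the model card; every other value is an illustrative assumption.
TRAIN_ARGS = {
    "imgsz": 224,        # from the card: 224x224 inputs
    "optimizer": "SGD",  # from the card: SGD with momentum
    "batch": -1,         # assumption: ask Ultralytics to pick a batch size automatically
    "fliplr": 0.5,       # assumption: probability of a horizontal flip
    "hsv_h": 0.015,      # assumption: hue jitter strength
    "hsv_s": 0.7,        # assumption: saturation jitter strength
    "hsv_v": 0.4,        # assumption: brightness jitter strength
    "scale": 0.5,        # assumption: random scaling strength
}


def fine_tune(data_dir: str):
    """Fine-tune the pretrained classifier on a folder-per-class dataset.

    `data_dir` is a hypothetical path with train/ and val/ subfolders, each
    holding one folder per class (for example "Drowsy" and "Non Drowsy").
    """
    # Imported lazily so the configuration can be inspected without ultralytics.
    from ultralytics import YOLO

    model = YOLO("yolo11x-cls.pt")  # start from pretrained classification weights
    return model.train(data=data_dir, **TRAIN_ARGS)


# Usage (hypothetical path):
# fine_tune("path/to/dataset")
```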
Evaluation
The model was evaluated on a completely unseen test set to ensure a fair assessment of its generalization capabilities.
Key Performance Metrics
Metric | Value | Description |
---|---|---|
Accuracy | 99.80% | Overall correctness on the test set. |
APCER | 0.00% | Rate of 'Drowsy' drivers missed (False Negatives). |
BPCER | 0.41% | Rate of 'Non Drowsy' drivers flagged (False Positives). |
ACER | 0.21% | Average of APCER and BPCER. |
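APCER (Attack Presentation Classification Error Rate) is the most critical safety metric here, because it counts drowsy drivers that the model fails to flag. APCER, BPCER (Bona Fide Presentation Classification Error Rate), and ACER (Average Classification Error Rate) are terms borrowed from presentation attack detection; in this card 'Drowsy' plays the role of the attack class and 'Non Drowsy' the bona fide class.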
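To make the table's definitions concrete, the metrics can be computed from four confusion counts as sketched below. The function name error_rates and the counts in the usage comment are hypothetical; they are not the test-set counts behind the reported values, which this card does not list.

```python
def error_rates(drowsy_missed: int, drowsy_total: int,
                awake_flagged: int, awake_total: int) -> dict:
    """Compute the card's metrics from raw counts.

    drowsy_missed: 'Drowsy' images predicted as 'Non Drowsy' (false negatives)
    awake_flagged: 'Non Drowsy' images predicted as 'Drowsy' (false positives)
    """
    apcer = drowsy_missed / drowsy_total   # share of drowsy drivers missed
    bpcer = awake_flagged / awake_total    # share of awake drivers flagged
    acer = (apcer + bpcer) / 2             # average of the two error rates
    correct = (drowsy_total - drowsy_missed) + (awake_total - awake_flagged)
    accuracy = correct / (drowsy_total + awake_total)
    return {"accuracy": accuracy, "apcer": apcer, "bpcer": bpcer, "acer": acer}


# Usage with made-up counts (not the card's test set):
# error_rates(drowsy_missed=0, drowsy_total=100, awake_flagged=1, awake_total=200)
```

Because ACER weights both classes equally, it can differ from 1 minus accuracy whenever the two classes have different sizes.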
Model Explainability (Grad-CAM)
To ensure the model is focusing on relevant facial features, Grad-CAM was used. The heatmaps confirm that the model's predictions are primarily based on the eye and mouth regions, as expected.
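For readers unfamiliar with Grad-CAM, the core computation is small: each feature-map channel is weighted by the spatial average of the class score's gradient with respect to that channel, the weighted maps are summed, and negative values are clipped. The sketch below shows only that step on plain nested lists to stay dependency-free. It is not the code used for this model; extracting activations and gradients from the network (for example with framework hooks) is assumed to happen elsewhere, and the function name grad_cam is hypothetical.

```python
def grad_cam(activations, gradients):
    """Return a Grad-CAM heatmap normalised to [0, 1].

    activations, gradients: nested lists shaped [channels][height][width],
    taken from one convolutional layer for one image and one target class.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]

    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * act[i][j]

    # ReLU: keep only regions that push the class score up.
    cam = [[max(value, 0.0) for value in row] for row in cam]

    # Scale so the strongest location equals 1 (skip if the map is all zeros).
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam
```

The resulting low-resolution map is typically upsampled to the input size and overlaid on the image to check which regions, such as the eyes and mouth, drive the prediction.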
Intended Use and Limitations
This model is intended as a proof-of-concept for driver safety systems. It should not be used as the sole mechanism for preventing accidents. Real-world performance may vary based on lighting conditions, camera angles, occlusions (e.g., sunglasses), and individual differences.
This model card is based on the training notebook yolov11_drowsiness.ipynb.