---
language:
- en
library_name: keras
pipeline_tag: image-classification
---

# Model Card for Model ID

This model classifies facial expressions into one of seven emotion categories: anger, happy, sad, fear, surprise, disgust, and neutral.

## Model Details

Dataset (number of images per class):

| Split | Happy | Angry | Disgust | Sad | Neutral | Fear | Surprise |
|---|---|---|---|---|---|---|---|
| Train | 14,379 | 7,988 | 872 | 9,768 | 9,947 | 8,200 | 6,376 |
| Test | 3,599 | 1,918 | 222 | 2,386 | 2,449 | 2,042 | 1,628 |
| Val | 2,880 | 1,600 | 172 | 1,954 | 1,990 | 1,640 | 1,628 |

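The class counts above are clearly imbalanced: the training split has 14,379 Happy images but only 872 Disgust images. One common way to derive class weights from such counts is inverse-frequency ("balanced") weighting, `weight = N / (K * n_c)`. The sketch below applies that heuristic to the training counts. The heuristic and the label-to-index order are assumptions, since this card does not state how the weights were actually computed.

```python
# Training-split image counts per class, taken from the dataset summary.
TRAIN_COUNTS = {
    "angry": 7988,
    "disgust": 872,
    "fear": 8200,
    "happy": 14379,
    "neutral": 9947,
    "sad": 9768,
    "surprise": 6376,
}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


weights = balanced_class_weights(TRAIN_COUNTS)

# Keras expects class weights keyed by integer class index. Alphabetical
# order is an assumption here; it must match the order used in training.
class_weight = {i: weights[label] for i, label in enumerate(sorted(TRAIN_COUNTS))}

for label in sorted(TRAIN_COUNTS):
    print(f"{label:>8}: {weights[label]:.3f}")
```

With these counts, Disgust receives a weight of roughly 9.4 and Happy roughly 0.57. A dictionary like `class_weight` can be passed to Keras via `model.fit(..., class_weight=class_weight)`.
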
Model:

1. Transfer learning using MobileNetV2 with two additional Dense layers and an output layer with a softmax activation function.
2. Class weights were used to adjust for class imbalance.
3. Total params: 3,675,823
4. Trainable params: 136,839
5. Accuracy: 0.823 | Precision: 0.825 | Recall: 0.823 | F1: 0.821

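The parameter counts above can be sanity-checked with a little arithmetic. A Dense layer with `n_in` inputs and `n_out` units has `n_in * n_out + n_out` parameters. The sketch below shows one head configuration that reproduces the reported numbers exactly. The layer widths (128 and 64), the 1000-dimensional input to the head, and the frozen full MobileNetV2 (3,538,984 parameters including its ImageNet classifier) are assumptions inferred from the totals, not details stated by the author.

```python
def dense_params(n_in, n_out):
    """Parameters in a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out


# Assumed: parameter count of Keras MobileNetV2 (alpha=1.0) including its
# 1000-class ImageNet top, kept frozen during training.
MOBILENETV2_WITH_TOP = 3_538_984

# Assumed head: two Dense layers plus a 7-way softmax output layer.
head_layers = [(1000, 128), (128, 64), (64, 7)]

trainable = sum(dense_params(n_in, n_out) for n_in, n_out in head_layers)
total = MOBILENETV2_WITH_TOP + trainable

print(f"Trainable params: {trainable:,}")
print(f"Total params:     {total:,}")
```

Under these assumptions the script prints 136,839 and 3,675,823, matching the reported trainable and total counts. Other configurations could produce the same totals, so this is a consistency check rather than a description of the actual architecture.
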
## Room for Improvement

This model was trained with extremely limited hardware acceleration (GPU) resources. It is therefore highly likely that evaluation metrics above the 95% mark can be achieved through the following:

1. MobileNetV2 was used for its fast inference and low latency, but with more resources a more suitable base model could perhaps be found.
2. Data augmentation to better correct for class imbalance.
3. Learning rate decay to keep training for longer (at a lower learning rate) after nearing a local minimum (approx. 60 epochs).
4. Error analysis.

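The learning rate decay idea in item 3 can be sketched as a schedule that holds the learning rate constant for the first 60 epochs and then decays it exponentially. The initial rate (`1e-3`), the decay factor (`0.9`), and the function name are hypothetical values chosen for illustration; only the 60-epoch mark comes from the text above.

```python
def lr_schedule(epoch, lr=None, initial_lr=1e-3, hold_epochs=60, decay=0.9):
    """Return the learning rate for a 0-indexed epoch.

    The rate stays at initial_lr for hold_epochs epochs, then shrinks by
    a factor of `decay` every epoch. The `lr` argument (the current rate)
    is accepted but ignored so the function fits the (epoch, lr) signature
    used by Keras' LearningRateScheduler callback.
    """
    if epoch < hold_epochs:
        return initial_lr
    return initial_lr * decay ** (epoch - hold_epochs + 1)


for epoch in (0, 59, 60, 70, 100):
    print(f"epoch {epoch:3d}: lr = {lr_schedule(epoch):.2e}")
```

In Keras, a function with this signature can be wrapped as `tf.keras.callbacks.LearningRateScheduler(lr_schedule)` and supplied to `model.fit` through its `callbacks` argument.
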
## Uses

Cannot be used for commercial purposes in the EU.

### Direct Use

Combine with the OpenCV Haar cascade for face detection.

## How to Get Started with the Model

Use the script below to run the model locally on your device's camera:

    import cv2
    import numpy as np
    import tensorflow as tf

    # Label order must match the class indices used during training.
    CLASS_LABELS = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

    # Load the Haar cascade once instead of on every frame.
    FACE_CASCADE = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
    )


    def display_emotion(frame, model):
        """Detect faces in a BGR frame and label each with its predicted emotion."""
        font = cv2.FONT_HERSHEY_SIMPLEX
        text_color = (0, 0, 255)  # Red in BGR

        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = FACE_CASCADE.detectMultiScale(gray, 1.1, 4)

        if len(faces) == 0:
            print("Face not detected...")
            return frame

        for x, y, w, h in faces:
            # Crop the face and resize it to the (224, 224) input size.
            face_roi = frame[y:y+h, x:x+w]
            resized_image = cv2.resize(face_roi, (224, 224))
            final_image = np.expand_dims(resized_image, axis=0)

            # Pixels are passed as raw BGR values, as in the original demo;
            # this assumes the saved model was trained on the same input format.
            predictions = model.predict(final_image, verbose=0)
            predicted_label = CLASS_LABELS[np.argmax(predictions)]

            # Green square around the face
            cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
            # Black background rectangle for the label
            cv2.rectangle(frame, (x, y), (x+w, y-25), (0, 0, 0), -1)
            # Add text
            cv2.putText(frame, predicted_label, (x, y-10), font, 0.7, text_color, 2)

        return frame


    def main():
        model = tf.keras.models.load_model('emotion_detection.keras')

        # Try an external camera first, then fall back to the default one.
        cap = cv2.VideoCapture(1)
        if not cap.isOpened():
            cap = cv2.VideoCapture(0)
        if not cap.isOpened():
            raise IOError("Cannot open webcam")

        while True:
            ret, frame = cap.read()
            if not ret:
                break

            frame = display_emotion(frame, model)
            cv2.imshow('Facial Expression Recognition', frame)

            # Press 'q' to quit.
            if cv2.waitKey(2) & 0xFF == ord('q'):
                break

        cap.release()
        cv2.destroyAllWindows()


    if __name__ == "__main__":
        main()

#### Preprocessing

MobileNetV2 receives image inputs of size (224, 224).

#### Speeds, Sizes, Times

Latency (local demo, no GPU): 39 ms/step

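As a rough guide to what this latency means for a live demo, the sketch below converts the per-step time into an upper bound on frames per second. It assumes one `model.predict` call per detected face and ignores face detection, drawing, and camera overhead, so real frame rates will be lower.

```python
LATENCY_MS = 39  # reported model latency per prediction step (no GPU)


def max_fps(faces_per_frame, latency_ms=LATENCY_MS):
    """Upper bound on frames per second when each face costs one step."""
    return 1000 / (latency_ms * faces_per_frame)


for n in (1, 2, 4):
    print(f"{n} face(s) per frame: at most {max_fps(n):.1f} FPS")
```

With a single face in view this gives about 25.6 FPS, and the bound halves each time the number of faces doubles.
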
## Model Card Authors

Ronny Nehme