---
language:
- en
library_name: keras
pipeline_tag: image-classification
---

# Model Card for Facial Expression Recognition

This model classifies emotions into one of seven categories: anger, happy, sad, fear, surprise, disgust, and neutral.

## Model Details

Dataset:

- Train: Happy - 14,379 / Angry - 7,988 / Disgust - 872 / Sad - 9,768 / Neutral - 9,947 / Fear - 8,200 / Surprise - 6,376
- Test: Happy - 3,599 / Angry - 1,918 / Disgust - 222 / Sad - 2,386 / Neutral - 2,449 / Fear - 2,042 / Surprise - 1,628
- Val: Happy - 2,880 / Angry - 1,600 / Disgust - 172 / Sad - 1,954 / Neutral - 1,990 / Fear - 1,640 / Surprise - 1,628

Model:

1. Transfer learning using MobileNetV2 with two additional Dense layers and an output layer with a softmax activation function.
2. Class weights were used to adjust for class imbalance.
3. Total params: 3,675,823
4. Trainable params: 136,839
5. Accuracy: 0.823 | Precision: 0.825 | Recall: 0.823 | F1: 0.821

Room for improvement:

This model was created with extremely limited hardware acceleration (GPU) resources. It is therefore highly likely that evaluation metrics surpassing the 95% mark could be achieved in the following ways:

1. MobileNetV2 was chosen for its fast inference and low latency, but with more resources a more suitable base model could be found.
2. Data augmentation to better correct for class imbalance.
3. Using a learning rate scheduler to train for longer (with a lower learning rate) after nearing a local minimum (approx. 60 epochs).

## Uses

Cannot be used for commercial purposes in the EU.

### Direct Use

Combine with the OpenCV Haar cascade for face detection, as in the example below.

## How to Get Started with the Model

Use the code below to run the model locally against a webcam feed:

```python
import cv2
import numpy as np
import tensorflow as tf


def display_emotion(frame, model):
    """Detect faces in the frame and annotate each with the predicted emotion."""
    font = cv2.FONT_HERSHEY_SIMPLEX
    text_color = (0, 0, 255)
    class_labels = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)

    if len(faces) == 0:
        print("Face not detected...")

    for (x, y, w, h) in faces:
        roi_color = frame[y:y+h, x:x+w]
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)  # Green square around the face

        # Resize the face crop to the model's expected input size and add a batch dimension
        resized_image = cv2.resize(roi_color, (224, 224))
        final_image = np.expand_dims(resized_image, axis=0)

        predictions = model.predict(final_image)
        predicted_label = class_labels[np.argmax(predictions)]

        # Black background rectangle behind the label
        cv2.rectangle(frame, (x, y), (x+w, y-25), (0, 0, 0), -1)
        # Add the predicted label as text
        cv2.putText(frame, predicted_label, (x, y-10), font, 0.7, text_color, 2)

    return frame


def main():
    model = tf.keras.models.load_model('best_model.keras')

    # Try the external webcam first, then fall back to the default camera
    cap = cv2.VideoCapture(1)
    if not cap.isOpened():
        cap = cv2.VideoCapture(0)
    if not cap.isOpened():
        raise IOError("Cannot open webcam")

    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame = display_emotion(frame, model)
        cv2.imshow('Facial Expression Recognition', frame)
        if cv2.waitKey(2) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```

### Training Data

Dataset used: FER (available on Kaggle).

#### Preprocessing

MobileNetV2 receives image inputs of size (224, 224).
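For a quick sanity check without a webcam, the snippet below sketches the same preprocessing on a single cropped face image. The file name `face_crop.jpg` is a placeholder, and, as in the webcam demo above, the only preprocessing assumed is resizing to 224x224 and adding a batch dimension (no extra rescaling is applied here, on the assumption that any normalization is handled inside the saved model).

```python
import cv2
import numpy as np
import tensorflow as tf

# Minimal single-image sketch; 'face_crop.jpg' is a placeholder for an already-cropped face.
model = tf.keras.models.load_model('best_model.keras')
class_labels = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

face = cv2.imread('face_crop.jpg')      # BGR image of a cropped face
face = cv2.resize(face, (224, 224))     # MobileNetV2 input size
batch = np.expand_dims(face, axis=0)    # shape (1, 224, 224, 3)

probs = model.predict(batch)[0]
print(class_labels[int(np.argmax(probs))], float(np.max(probs)))
```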
#### Speeds, Sizes, Times

Latency (local demo, no GPU): 39 ms/step

## Model Card Authors

Ronny Nehme