---
language:
- en
library_name: keras
pipeline_tag: image-classification
---

# Model Card: Facial Emotion Classification

This model classifies facial expressions into one of seven categories: anger, happiness, sadness, fear, surprise, disgust, and neutral.

## Model Details

Dataset (image counts per class):

| Split | Happy  | Angry | Disgust | Sad   | Neutral | Fear  | Surprise |
|-------|--------|-------|---------|-------|---------|-------|----------|
| Train | 14,379 | 7,988 | 872     | 9,768 | 9,947   | 8,200 | 6,376    |
| Test  | 3,599  | 1,918 | 222     | 2,386 | 2,449   | 2,042 | 1,628    |
| Val   | 2,880  | 1,600 | 172     | 1,954 | 1,990   | 1,640 | 1,628    |

Model:

1. Transfer learning using MobileNetV2, with two additional Dense layers and a softmax output layer (sketched below).
2. Class weights were used to correct for class imbalance.
3. Total params: 3,675,823
4. Trainable params: 136,839
5. Accuracy: 0.823 | Precision: 0.825 | Recall: 0.823 | F1: 0.821

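A minimal sketch of this setup, assuming ImageNet weights, a frozen backbone, and hypothetical Dense widths of 128 and 64 — the card states only the base model, the 224x224 input size, and the seven-class softmax output:

```python
import tensorflow as tf

# Pretrained MobileNetV2 backbone without its classifier head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # transfer learning: freeze the pretrained weights

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation="relu"),   # assumed width
    tf.keras.layers.Dense(64, activation="relu"),    # assumed width
    tf.keras.layers.Dense(7, activation="softmax"),  # one unit per emotion
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Class weighting (point 2): weight each class inversely to its frequency,
# using the training counts from the table above. Label order matches the
# class_labels list in the demo code below.
counts = {"angry": 7988, "disgust": 872, "fear": 8200, "happy": 14379,
          "neutral": 9947, "sad": 9768, "surprise": 6376}
total = sum(counts.values())
class_weight = {i: total / (len(counts) * n)
                for i, n in enumerate(counts.values())}
# model.fit(train_ds, validation_data=val_ds,
#           class_weight=class_weight, epochs=60)
```
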
Room for improvement:

This model was trained with extremely limited GPU resources, so it is likely that evaluation metrics above the 95% mark could be reached by:

1. Choosing a different base model. MobileNetV2 was used for its fast inference and low latency, but with more resources a more suitable backbone could be found.
2. Applying data augmentation to better correct for class imbalance.
3. Using a learning rate scheduler to train for longer (with a lower learning rate) after nearing a local minimum (approximately 60 epochs); see the sketch after this list.

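A minimal sketch of point 3 using a Keras `LearningRateScheduler` callback; the epoch-60 threshold comes from the card, while the decay factor is an assumption:

```python
import tensorflow as tf

def schedule(epoch, lr):
    # Hold the initial learning rate for ~60 epochs (where training was
    # observed to near a local minimum), then decay it each epoch.
    return lr if epoch < 60 else lr * 0.9  # 0.9 is an assumed decay factor

lr_callback = tf.keras.callbacks.LearningRateScheduler(schedule)
# Passed to training alongside the class weights, e.g.:
# model.fit(train_ds, validation_data=val_ds, epochs=120,
#           class_weight=class_weight, callbacks=[lr_callback])
```
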
## Uses

This model cannot be used for commercial purposes in the EU.

### Direct Use

Combine with an OpenCV Haar cascade face detector: locate faces first, then classify each face crop with this model (see the example below).

## How to Get Started with the Model

Use the code below to get started with the model locally:

```python
import cv2
import numpy as np
import tensorflow as tf

def display_emotion(frame, model):
    """Detect faces in the frame and annotate each with a predicted emotion."""
    font = cv2.FONT_HERSHEY_SIMPLEX
    text_color = (0, 0, 255)

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)

    for x, y, w, h in faces:
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = frame[y:y+h, x:x+w]
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)  # green square

        # Re-run detection inside the region of interest to tighten the crop.
        faces_in_roi = face_cascade.detectMultiScale(roi_gray)
        if len(faces_in_roi) == 0:
            print("Face not detected...")
        else:
            for ex, ey, ew, eh in faces_in_roi:
                face_roi = roi_color[ey:ey+eh, ex:ex+ew]

            # Resize to MobileNetV2's (224, 224) input and add a batch dimension.
            resized_image = cv2.resize(face_roi, (224, 224))
            final_image = np.expand_dims(resized_image, axis=0)

            predictions = model.predict(final_image)
            class_labels = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
            predicted_label = class_labels[np.argmax(predictions)]

            # Black background rectangle behind the label text
            cv2.rectangle(frame, (x, y), (x+w, y-25), (0, 0, 0), -1)
            # Add text
            cv2.putText(frame, predicted_label, (x, y-10), font, 0.7, text_color, 2)
            cv2.rectangle(frame, (x, y), (x+w, y+h), text_color)

    return frame

def main():
    model = tf.keras.models.load_model('best_model.keras')

    # Try an external webcam first, then fall back to the default camera.
    cap = cv2.VideoCapture(1)
    if not cap.isOpened():
        cap = cv2.VideoCapture(0)
        if not cap.isOpened():
            raise IOError("Cannot open webcam")

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        frame = display_emotion(frame, model)
        cv2.imshow('Facial Expression Recognition', frame)

        # Press 'q' to quit.
        if cv2.waitKey(2) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()
```

### Training Data

Dataset used: FER (available on Kaggle)

#### Preprocessing

MobileNetV2 receives image inputs of size (224, 224), so each face crop is resized before inference.

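The same preparation step used in the demo above, shown in isolation (the function name `prepare` is illustrative, not part of the released code):

```python
import cv2
import numpy as np

def prepare(face_bgr: np.ndarray) -> np.ndarray:
    """Resize a face crop to MobileNetV2's input size and add a batch axis."""
    resized = cv2.resize(face_bgr, (224, 224))
    return np.expand_dims(resized, axis=0)  # shape (1, 224, 224, 3)
```
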
#### Speeds, Sizes, Times

Latency (local demo, no GPU): 39 ms/step

## Model Card Authors

Ronny Nehme