CLIP-ViT-Large-Patch14 Fine-Tuned on FER-2013 Dataset

This model is a fine-tuned version of the CLIP-ViT-Large-Patch14 model on the FER-2013 dataset.

Overview

The model is designed for facial emotion recognition and can classify images into the following 7 primary emotions:

Neutral
Anger
Happiness
Fear
Disgust
Sadness
Surprise

It builds on the vision encoder of the original CLIP model, which OpenAI developed to enable zero-shot image classification, and fine-tunes it on the FER-2013 dataset specifically for facial emotion recognition.
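As a rough illustration of how a 7-way classifier output could be turned into one of the emotions listed above, here is a minimal sketch. It assumes the model emits one logit per class and that the class indices follow the order of the list in this card. Neither assumption is confirmed here, so check the checkpoint's own label mapping before relying on it. The names EMOTIONS, softmax, and decode_prediction are hypothetical helpers, and the example logits are made-up numbers, not model output.

```python
import math

# Assumption: index order follows the list in this card. Verify against the
# checkpoint's own label mapping before use.
EMOTIONS = ["Neutral", "Anger", "Happiness", "Fear", "Disgust", "Sadness", "Surprise"]


def softmax(logits):
    """Numerically stable softmax over a sequence of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits, labels=EMOTIONS):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError(f"expected {len(labels)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for a single image (illustrative values only).
label, prob = decode_prediction([0.1, -1.2, 3.4, 0.0, -0.5, 0.3, 1.1])
print(label, round(prob, 3))
```

Returning the probability alongside the label makes it easy to flag low-confidence predictions for manual review instead of treating every output as equally reliable.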

⚠️ Important Note: This model is intended for research use. Any real-world or commercial deployment should first undergo thorough domain-specific testing and fairness evaluation, because of potential biases and performance variability.

Model Details

Base Model: openai/clip-vit-large-patch14
Fine-tuned on: FER-2013
Task: Facial Emotion Recognition
Number of Classes: 7

License

Apache-2.0 License

Intended Use

This model is intended for researchers and developers working on:

Facial expression recognition
Emotion detection in images
Human-computer interaction studies
Psychological and behavioral modeling

Limitations

The model was fine-tuned exclusively on static, grayscale, frontally aligned face images (FER-2013 images are 48x48 pixels).
Performance may degrade significantly with occluded faces, side profiles, or low-resolution images.
The model has not been evaluated for fairness across different demographics such as race, gender, or age groups.
Its predictions may reflect biases in how the emotion class labels were defined and annotated.
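Since fairness across demographic groups has not been evaluated, one simple starting point is to compare accuracy per group on a labeled evaluation set. The sketch below is only an assumption about how such a check might look: accuracy_by_group and max_accuracy_gap are hypothetical helpers, the group tags "A" and "B" are placeholders that would have to come from an evaluation set that provides them, and the labels and predictions are toy values rather than results from this model.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel sequences of labels and group tags."""
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("y_true, y_pred and groups must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy data for illustration only (not model output).
y_true = ["Happiness", "Sadness", "Anger", "Fear"]
y_pred = ["Happiness", "Sadness", "Fear", "Fear"]
groups = ["A", "A", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group, max_accuracy_gap(per_group))
```

A large gap between groups is a signal to investigate further, not a complete fairness assessment. Per-class metrics and confidence intervals would be natural next steps.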