πŸ“ About

CAMEO (Collection of Multilingual Emotional Speech Corpora) is a benchmark dataset designed to support research in Speech Emotion Recognition (SER), especially in multilingual and cross-lingual settings.

The collection brings together 13 emotional speech datasets covering 8 languages, among them English, German, Spanish, French, and Serbian. In total, it contains 41,265 audio samples. Every sample is annotated for emotion and, in most cases, also for speaker ID, gender, and age.

Here are a few quick facts about the dataset:

  • Over 33% of the samples are in English.
  • 17 distinct emotional states are represented across datasets.
  • 93.5% of samples fall under the seven primary emotions: neutral, anger, sadness, surprise, happiness, disgust, and fear.
  • Gender annotations are available for over 92% of samples.
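
To give a feel for the scale behind these percentages, here is a minimal sketch that converts them into approximate sample counts. It only uses the total of 41,265 samples and the shares quoted above; the variable and function names are illustrative, and because two of the shares are stated as "over", the corresponding counts are lower bounds rather than exact figures.

```python
# Total number of audio samples, as stated in the description above.
TOTAL_SAMPLES = 41_265

# Shares quoted in the quick facts. "Over 33%" and "over 92%" are
# treated as lower bounds, so the derived counts are approximate.
shares = {
    "English (at least)": 0.33,
    "seven primary emotions": 0.935,
    "gender annotated (at least)": 0.92,
}


def approx_count(total: int, share: float) -> int:
    """Return the share of the total, rounded to the nearest sample."""
    return round(total * share)


counts = {label: approx_count(TOTAL_SAMPLES, share) for label, share in shares.items()}

for label, n in counts.items():
    print(f"{label}: ~{n:,} samples")
```

Under these assumptions, the seven primary emotions account for roughly 38,600 samples, which leaves only a few thousand samples spread over the remaining emotional states.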

All datasets included in CAMEO are openly available. We've made the full collection accessible on Hugging Face, along with metadata, tools, and a leaderboard for evaluation.

🔗 View the CAMEO Dataset on Hugging Face

Whether you're building SER models or exploring emotion understanding across languages, CAMEO is here to support your research.