CAiRE/SER-wav2vec2-large-xlsr-53-eng-zho-adults

holylovenia committed on Jun 27, 2023

Commit e235189 (1 parent: 366dc06)

Update README.md

README.md +40 -0

@@ -1,3 +1,43 @@
 ---
 license: cc-by-sa-4.0
+datasets:
+- Ar4ikov/iemocap_audio_text_splitted
+language:
+- en
+- zh
+metrics:
+- f1
+library_name: transformers
+pipeline_tag: audio-classification
+tags:
+- speech-emotion-recognition
 ---
+# Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition
+This model is [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) fine-tuned on English and Chinese speech from adult speakers.
+The model is trained on the training sets of [CREMA-D](https://github.com/CheyneyComputerScience/CREMA-D), [ESD](https://github.com/HLTSingapore/Emotional-Speech-Data), [IEMOCAP](https://sail.usc.edu/iemocap/iemocap_release.htm), and [TESS](https://www.kaggle.com/datasets/ejlok1/toronto-emotional-speech-set-tess).
+When using this model, make sure that your speech input is sampled at 16 kHz.
+The scripts used for training and evaluation can be found here:
+[https://github.com/HLTCHKUST/elderly_ser/tree/main](https://github.com/HLTCHKUST/elderly_ser/tree/main)
+## Evaluation Results
+For the details (e.g., the statistics of `train`, `valid`, and `test` data), please refer to our paper on [arXiv](https://arxiv.org/abs/2306.14517).
+The paper also reports the model's speech emotion recognition performance on six evaluation groups: English-All, Chinese-All, English-Elderly, Chinese-Elderly, English-Adults, and Chinese-Adults.
+## Citation
+Our paper will be published at INTERSPEECH 2023. In the meantime, you can find our paper on [arXiv](https://arxiv.org/abs/2306.14517).
+If you find our work useful, please consider citing our paper as follows:
+```
+@misc{cahyawijaya2023crosslingual,
+      title={Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition},
+      author={Samuel Cahyawijaya and Holy Lovenia and Willy Chung and Rita Frieske and Zihan Liu and Pascale Fung},
+      year={2023},
+      eprint={2306.14517},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
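
The README's requirement that speech input be sampled at 16 kHz can be sketched as follows. This is a minimal sketch, not the authors' preprocessing (their scripts are in the linked repository). The helper names `resample_linear` and `classify_emotion` are hypothetical. The resampler uses plain linear interpolation with no anti-aliasing filter, so a dedicated resampler from an audio library is likely a better choice in practice. The sketch also assumes that the checkpoint loads through the `transformers` `audio-classification` pipeline, which the README metadata (`library_name`, `pipeline_tag`) suggests but does not demonstrate.

```python
import numpy as np

TARGET_SR = 16_000  # sampling rate the README asks for


def resample_linear(waveform, orig_sr, target_sr=TARGET_SR):
    """Resample a mono waveform to target_sr by linear interpolation.

    Hypothetical helper for illustration only: it applies no
    anti-aliasing filter, so downsampled audio may contain aliasing.
    """
    if orig_sr <= 0 or target_sr <= 0:
        raise ValueError("sampling rates must be positive")
    waveform = np.asarray(waveform, dtype=np.float32)
    if waveform.ndim != 1:
        raise ValueError("expected a mono (1-D) waveform")
    if orig_sr == target_sr or waveform.size == 0:
        return waveform
    n_out = int(round(waveform.size * target_sr / orig_sr))
    # Time stamps (in seconds) of the input and output samples.
    t_in = np.arange(waveform.size) / orig_sr
    t_out = np.arange(n_out) / target_sr
    return np.interp(t_out, t_in, waveform).astype(np.float32)


def classify_emotion(waveform_16k,
                     model_id="CAiRE/SER-wav2vec2-large-xlsr-53-eng-zho-adults"):
    """Run the checkpoint on a 16 kHz mono float waveform.

    Assumption: the checkpoint is loadable by the audio-classification
    pipeline. The import is lazy so the resampler works without transformers.
    """
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    # A raw NumPy array is taken to be at the model's sampling rate already.
    return classifier(waveform_16k)
```

For example, a clip recorded at 44.1 kHz would first go through `resample_linear(clip, 44_100)` and the result would then be passed to `classify_emotion`. The first call to `classify_emotion` downloads the model weights.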