Speech Emotion Recognition with Multi-Task Learning
Interspeech 2021 · 2021-09-06
Xingyu Cai, Jiahong Yuan, Renjie Zheng, Liang Huang, Kenneth Church
Code: github.com/TideDancer/interspeech21_emotion (PyTorch)
Abstract
Speech emotion recognition (SER) classifies speech into emotion categories such as Happy, Angry, Sad, and Neutral. Recently, deep learning has been applied to the SER task. This paper proposes a multi-task learning (MTL) framework that simultaneously performs speech-to-text recognition and emotion classification, using an end-to-end deep neural model based on wav2vec-2.0. Experiments on the IEMOCAP benchmark show that the proposed method achieves state-of-the-art performance on the SER task. In addition, an ablation study establishes the effectiveness of the proposed MTL framework.
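To make the MTL idea concrete, the sketch below shows one plausible way to attach two task heads to a shared speech encoder and combine their losses: a frame-level ASR head trained with CTC and an utterance-level emotion head trained with cross-entropy. This is a hedged illustration, not the paper's implementation: the small GRU encoder is a stand-in for wav2vec-2.0 (which would normally be loaded as a pretrained model), and the module names, head sizes, and the loss weight `alpha` are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MtlSerModel(nn.Module):
    """Sketch of a joint ASR + emotion model (names/sizes are illustrative)."""

    def __init__(self, feat_dim=64, vocab_size=32, num_emotions=4):
        super().__init__()
        # Stand-in encoder for wav2vec-2.0: maps raw samples to frame features.
        self.encoder = nn.GRU(1, feat_dim, batch_first=True)
        # ASR head: per-frame token logits for CTC decoding.
        self.asr_head = nn.Linear(feat_dim, vocab_size)
        # Emotion head: one prediction per utterance (e.g. 4 IEMOCAP classes).
        self.emo_head = nn.Linear(feat_dim, num_emotions)

    def forward(self, wav):
        frames, _ = self.encoder(wav.unsqueeze(-1))     # (B, T, D)
        asr_logits = self.asr_head(frames)              # (B, T, V)
        emo_logits = self.emo_head(frames.mean(dim=1))  # mean-pool -> (B, E)
        return asr_logits, emo_logits

def mtl_loss(asr_logits, emo_logits, targets, target_lens, emo_labels, alpha=0.1):
    """Weighted sum of the emotion loss and the auxiliary CTC loss."""
    B, T, _ = asr_logits.shape
    log_probs = asr_logits.log_softmax(-1).transpose(0, 1)  # (T, B, V) for CTC
    input_lens = torch.full((B,), T, dtype=torch.long)
    ctc = F.ctc_loss(log_probs, targets, input_lens, target_lens)
    ce = F.cross_entropy(emo_logits, emo_labels)
    return ce + alpha * ctc
```

In use, a training step would forward a batch of waveforms, compute `mtl_loss` against the transcript tokens and emotion labels, and backpropagate through the shared encoder, so the ASR objective acts as a regularizer for the emotion task.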