Leveraging Pre-Trained Acoustic Feature Extractor For Affective Vocal Bursts Tasks

2022-12-21 · APSIPA 2022 · Code Available

Bagus Tris Atmaja, Akira Sasou


Abstract

Understanding human emotions is a challenge for computers. Research on speech emotion recognition has progressed steadily; however, beyond speech, affective information may also lie in short vocal bursts (e.g., a cry when sad). In this study, we evaluated a recent self-supervised learning model for extracting acoustic embeddings for affective vocal burst tasks. Four tasks were investigated, spanning both regression and classification problems. Using similar architectures across tasks, we found a pre-trained model to be more effective than the baseline methods. The study is further expanded to evaluate the effect of different random seeds, early-stopping patience values, and batch sizes on the performance of the four tasks.
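The abstract describes a common pattern: a frozen pre-trained acoustic extractor produces frame-level embeddings, which are pooled into an utterance vector and fed to a small task head for regression or classification. The paper's exact architecture and extractor are not specified here, so the following is only a minimal NumPy sketch of that pattern, with random arrays standing in for real frame embeddings (e.g., from a HuBERT-style model); the dimensions and class count are illustrative assumptions.

```python
import numpy as np

def pool_embeddings(frame_embeddings: np.ndarray) -> np.ndarray:
    """Mean-pool frame-level acoustic embeddings into one utterance-level vector."""
    return frame_embeddings.mean(axis=0)

class LinearHead:
    """A simple linear head; outputs logits (classification) or values (regression)."""
    def __init__(self, dim: int, n_out: int, seed: int = 0):
        rng = np.random.default_rng(seed)  # seed matters, as the paper's seed study suggests
        self.W = rng.normal(scale=0.01, size=(dim, n_out))
        self.b = np.zeros(n_out)

    def __call__(self, x: np.ndarray) -> np.ndarray:
        return x @ self.W + self.b

# Hypothetical input: 120 frames of 768-dim embeddings from a pre-trained extractor.
frames = np.random.default_rng(1).normal(size=(120, 768))
utt = pool_embeddings(frames)      # shape (768,) — one vector per vocal burst
head = LinearHead(768, 10)         # e.g., 10 emotion classes (assumed count)
logits = head(utt)                 # shape (10,)
```

In a real pipeline the `frames` array would come from the pre-trained model's hidden states, and the head would be trained with early stopping (hence the patience and batch-size sweep mentioned above).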
