
Multi-Modality in Music: Predicting Emotion in Music from High-Level Audio Features and Lyrics

2023-02-26

Tibor Krols, Yana Nikolova, Ninell Oldenburg


Abstract

This paper tests whether a multi-modal approach to music emotion recognition (MER) performs better than a uni-modal one on high-level song features and lyrics. We use 11 song features retrieved from the Spotify API, combined with lyric features including sentiment, TF-IDF, and ANEW, to predict valence and arousal (Russell, 1980) scores on the Deezer Mood Detection Dataset (DMDD) (Delbouys et al., 2018) with 4 different regression models. We find that of the 11 high-level song features, mainly 5 contribute to performance, and that multi-modal features do better than audio features alone when predicting valence. We made our code publicly available.
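To make the setup concrete, the sketch below shows one way such a multi-modal regression could be assembled: high-level Spotify audio features are concatenated with TF-IDF lyric features, and a single regressor is fit for valence (the same pattern applies to arousal). The file name, column names (e.g. "danceability", "lyrics", "valence"), and the choice of Ridge regression are illustrative assumptions, not the authors' actual schema or pipeline.

```python
# Hedged sketch of a multi-modal (audio + lyrics) valence regressor.
# All column/file names below are hypothetical placeholders.
import pandas as pd
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Hypothetical merged dataset: DMDD songs joined with Spotify API features.
df = pd.read_csv("dmdd_with_spotify_features.csv")

# A subset of high-level audio features (the paper uses 11 in total).
audio_cols = ["danceability", "energy", "loudness", "speechiness", "acousticness"]
X_audio = csr_matrix(df[audio_cols].values)

# TF-IDF features over raw lyric text (one of the lyric feature sets used).
tfidf = TfidfVectorizer(max_features=5000, stop_words="english")
X_lyrics = tfidf.fit_transform(df["lyrics"])

# Multi-modal fusion: stack audio and lyric features side by side.
X = hstack([X_audio, X_lyrics])
y = df["valence"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = Ridge(alpha=1.0).fit(X_train, y_train)
print("R^2 on held-out songs:", r2_score(y_test, model.predict(X_test)))
```

Dropping the `X_lyrics` block from the `hstack` call gives the corresponding uni-modal (audio-only) baseline, which is the comparison the abstract reports on for valence.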
