Sample-level CNN Architectures for Music Auto-tagging Using Raw Waveforms
Taejun Kim, Jongpil Lee, Juhan Nam
Code
- github.com/tae-jun/resemul (official, in paper, TensorFlow) ★ 58
- github.com/jaehwlee/tf2-music-tagging-models (TensorFlow) ★ 0
Abstract
Recent work has shown that the end-to-end approach using convolutional neural networks (CNNs) is effective in various types of machine learning tasks. For audio signals, this approach takes raw waveforms as input using a 1-D convolutional layer. In this paper, we improve the 1-D CNN architecture for music auto-tagging by adopting building blocks from state-of-the-art image classification models, ResNets and SENets, and by adding multi-level feature aggregation. We compare different combinations of these modules in building CNN architectures. The results show that they achieve significant improvements over previous state-of-the-art models on the MagnaTagATune dataset and comparable results on the Million Song Dataset. Furthermore, we analyze and visualize our model to show how the 1-D CNN operates.
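To make the building blocks concrete, the following is a minimal NumPy sketch of the two operations the abstract combines: a strided 1-D convolution applied directly to raw waveform samples, followed by a squeeze-and-excitation (SE) channel recalibration. All shapes, weights, and function names here are illustrative assumptions, not the authors' implementation (the official code is in the linked repository).

```python
import numpy as np

def conv1d(x, w, stride=1):
    """Valid strided 1-D convolution: x is (C_in, T), w is (C_out, C_in, K)."""
    c_out, c_in, k = w.shape
    t_out = (x.shape[1] - k) // stride + 1
    y = np.zeros((c_out, t_out))
    for t in range(t_out):
        patch = x[:, t * stride : t * stride + k]           # (C_in, K) window
        y[:, t] = np.tensordot(w, patch, axes=([1, 2], [0, 1]))
    return y

def se_block(x, w_reduce, w_expand):
    """Squeeze-and-excitation: rescale each channel of x (C, T) by a learned gate."""
    z = x.mean(axis=1)                         # squeeze: global average pool over time
    s = np.maximum(w_reduce @ z, 0.0)          # excitation: bottleneck FC + ReLU
    s = 1.0 / (1.0 + np.exp(-(w_expand @ s)))  # FC + sigmoid -> per-channel gate in (0, 1)
    return x * s[:, None]                      # channel-wise recalibration

rng = np.random.default_rng(0)
wave = rng.standard_normal((1, 12))        # toy mono waveform, 12 raw samples
w = rng.standard_normal((4, 1, 3))         # 4 sample-level filters, kernel size 3
h = np.maximum(conv1d(wave, w, stride=3), 0.0)   # strided conv + ReLU -> (4, 4)
w_reduce = rng.standard_normal((2, 4))     # bottleneck (reduction ratio 2), hypothetical size
w_expand = rng.standard_normal((4, 2))
out = se_block(h, w_reduce, w_expand)
print(out.shape)                           # (4, 4): same shape, channels rescaled
```

In the paper's full architecture such blocks are stacked many times (with residual connections and much larger channel counts), and multi-level feature aggregation concatenates pooled activations from several depths before the classifier.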