Deeper Convolutional Neural Networks and Broad Augmentation Policies Improve Performance in Musical Key Estimation

2021-11-07 · Proceedings of the International Society for Music Information Retrieval Conference (ISMIR) 2021

Stefan Andreas Baumann

Code Available

Abstract

In recent years, complex convolutional neural network architectures such as the Inception architecture have been shown to offer significant improvements over previous architectures in image classification. So far, little work has been done applying these architectures to music information retrieval tasks, with most models still relying on sequential neural network architectures. In this paper, we adapt the Inception architecture to the specific needs of harmonic music analysis and use it to create a model (InceptionKeyNet) for the task of key estimation. We then show that the resulting model can significantly outperform state-of-the-art single-task models when trained on the same datasets. Additionally, we evaluate a broad range of augmentation methods and find that extending augmentation policies to include a more diverse set of methods further improves accuracy. Finally, we train both the proposed and state-of-the-art single-task models on differently sized training datasets and with different augmentation policies and compare their generalization performance.