Audio Spectrogram Representations for Processing with Convolutional Neural Networks

2017-06-29

L. Wyse


Abstract

One of the decisions that arise when designing a neural network for any application is how the data should be represented in order to be presented to, and possibly generated by, a neural network. For audio, the choice is less obvious than it seems to be for visual images, and a variety of representations have been used for different applications including the raw digitized sample stream, hand-crafted features, machine discovered features, MFCCs and variants that include deltas, and a variety of spectral representations. This paper reviews some of these representations and issues that arise, focusing particularly on spectrograms for generating audio using neural networks for style transfer.
