Stable Audio Open

2024-07-19

Zach Evans, Julian D. Parker, CJ Carr, Zack Zukowski, Josiah Taylor, Jordi Pons

Code

github.com/stability-ai/stable-audio-tools
Official · In paper · PyTorch · ★ 3,639

Abstract

Open generative models are vitally important for the community, allowing for fine-tunes and serving as baselines when presenting new models. However, most current text-to-audio models are private and not accessible for artists and researchers to build upon. Here we describe the architecture and training process of a new open-weights text-to-audio model trained with Creative Commons data. Our evaluation shows that the model's performance is competitive with the state-of-the-art across various metrics. Notably, the reported FD_openl3 results (measuring the realism of the generations) showcase its potential for high-quality stereo sound synthesis at 44.1 kHz.

Tasks

Audio Generation, Text-to-Music Generation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
AudioCaps	Stable Audio Open	FD_openl3	78.24	—	Unverified
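
The FD_openl3 value above is a Fréchet distance computed between embeddings of generated and reference audio, where lower means the two distributions are closer. As a rough illustration only, and not the authors' evaluation code, the sketch below computes the Fréchet distance between two sets of embeddings with NumPy. It assumes the embeddings (for example from OpenL3) have already been extracted into arrays of shape (n_samples, dim); the function name `frechet_distance` is hypothetical.

```python
import numpy as np


def frechet_distance(real_emb, gen_emb):
    """Fréchet distance between Gaussians fitted to two embedding sets.

    Both inputs are arrays of shape (n_samples, dim). This is a generic
    sketch; the embedding extraction step is assumed to happen elsewhere.
    """
    real_emb = np.asarray(real_emb, dtype=np.float64)
    gen_emb = np.asarray(gen_emb, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each set of embeddings.
    mu_r, mu_g = real_emb.mean(axis=0), gen_emb.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(real_emb, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_emb, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g. For positive semi-definite covariances
    # these are real and non-negative, so tiny negative or imaginary parts
    # caused by floating-point error are discarded.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)
```

Calling `frechet_distance(reference_embeddings, generated_embeddings)` on two such arrays returns a single non-negative score, up to floating-point error. Absolute values depend on the embedding model and the reference set, so scores are only comparable when both are held fixed.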
