SOTAVerified

Audio Generation

Audio generation (synthesis) is the task of generating raw audio such as speech.

Papers

Showing 221-230 of 270 papers

Title | Status | Hype
Soundify: Matching Sound Effects to Video |  | 0
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video |  | 0
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses | Code | 1
Unsupervised Source Separation By Steering Pretrained Music Models | Code | 1
Taming Visually Guided Sound Generation | Code | 1
An investigation of pre-upsampling generative modelling and Generative Adversarial Networks in audio super resolution |  | 0
Depth Infused Binaural Audio Generation using Hierarchical Cross-Modal Attention |  | 0
Neural Waveshaping Synthesis | Code | 1
CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis |  | 0
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | AudioGen | FD_openl3 | 185.53 |  | Unverified
2 | AudioLDM2-large | FD_openl3 | 158.04 |  | Unverified
3 | Stable Audio 2.0 | FD_openl3 | 110.62 |  | Unverified
4 | Stable Audio | FD_openl3 | 103.66 |  | Unverified
5 | ETTA | FD_openl3 | 80.13 |  | Unverified
6 | TangoFlux-base | FD_openl3 | 79.7 |  | Unverified
7 | Stable Audio Open | FD_openl3 | 78.24 |  | Unverified
8 | TangoFlux | FD_openl3 | 75.1 |  | Unverified
9 | ETTA-FT-AC-100k | FD_openl3 | 61.79 |  | Unverified
10 | Diffsound | FAD | 7.75 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VAB-Encodec (Ours) | Bits per byte | 40 |  | Unverified
2 | Sparse Transformer 152M (strided) | Bits per byte | 1.97 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SymphonyNet | Human listening average results | 3.5 |  | Unverified