SOTAVerified

Audio Synthesis

Papers

Showing 1-25 of 127 papers

Title | Status | Hype
MIDI-VALLE: Improving Expressive Piano Performance Synthesis Through Neural Codec Language Modelling | - | 0
Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance | - | 0
Diffusion-Based Symbolic Regression | - | 0
SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet | - | 0
Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism | - | 0
DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis | - | 0
Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates | Code | 1
Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis | - | 0
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis | - | 0
Designing Neural Synthesizers for Low-Latency Interaction | - | 0
Long-Video Audio Synthesis with Multi-Agent Collaboration | - | 0
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | - | 0
XAttnMark: Learning Robust Audio Watermarking with Cross-Attention | - | 0
Generative diffusion model with inverse renormalization group flows | Code | 1
Customized Condition Controllable Generation for Video Soundtrack | - | 0
Tri-Ergon: Fine-grained Video-to-Audio Generation with Multi-modal Conditions and LUFS Control | - | 0
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis | Code | 7
CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder | - | 0
Zero-Shot Mono-to-Binaural Speech Synthesis | - | 0
Generalized Diffusion Model with Adjusted Offset Noise | - | 0
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows | Code | 2
VQalAttent: a Transparent Speech Generation Pipeline based on Transformer-learned VQ-VAE Latent Space | - | 0
Annotation-Free MIDI-to-Audio Synthesis via Concatenative Synthesis and Generative Refinement | - | 0
Array2BR: An End-to-End Noise-immune Binaural Audio Synthesis from Microphone-array Signals | - | 0
Where are we in audio deepfake detection? A systematic analysis over generative and detection models | Code | 1
Page 1 of 6

No leaderboard results yet.