| Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling | Nov 14, 2019 | Audio Generation, Causal Language Modeling | Code Available | 0 | 5 |
| An Initial Exploration: Learning to Generate Realistic Audio for Silent Video | Aug 23, 2023 | Audio Generation | Code Available | 0 | 5 |
| Smoothed Dilated Convolutions for Improved Dense Prediction | Aug 27, 2018 | Audio Generation, Machine Translation | Code Available | 0 | 5 |
| Score and Lyrics-Free Singing Voice Generation | Dec 26, 2019 | Audio Generation, Singing Voice Synthesis | Code Available | 0 | 5 |
| PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior | Jun 11, 2021 | Audio Generation, Denoising | Code Available | 0 | 5 |
| Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling? | Jun 13, 2024 | Audio Generation, Data Augmentation | Code Available | 0 | 5 |
| Audio Super Resolution using Neural Networks | Aug 2, 2017 | Audio Generation, Audio Super-Resolution | Code Available | 0 | 5 |
| LVNS-RAVE: Diversified audio generation with RAVE and Latent Vector Novelty Search | Apr 22, 2024 | Audio Generation, Deep Learning | Code Available | 0 | 5 |
| Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization | Jan 22, 2025 | Audio Generation, Retrieval | Code Available | 0 | 5 |
| Sounding Video Generator: A Unified Framework for Text-guided Sounding Video Generation | Mar 29, 2023 | Audio Generation, Contrastive Learning | Code Available | 0 | 5 |
| Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance | May 27, 2025 | Audio Generation, Denoising | Code Available | 0 | 5 |
| Challenge on Sound Scene Synthesis: Evaluating Text-to-Audio Generation | Oct 23, 2024 | Audio Generation | Code Available | 0 | 5 |
| MelNet: A Generative Model for Audio in the Frequency Domain | Jun 4, 2019 | Audio Generation, Music Generation | Code Available | 0 | 5 |
| Music Source Separation in the Waveform Domain | Nov 27, 2019 | Audio Generation, Audio Synthesis | Code Available | 0 | 5 |
| Conditional WaveGAN | Sep 27, 2018 | Audio Generation | Code Available | 0 | 5 |
| Adversarial Generation of Time-Frequency Features with application in audio synthesis | Feb 11, 2019 | Audio Generation, Audio Synthesis | Code Available | 0 | 5 |
| Stochastic Diffusion: A Diffusion Probabilistic Model for Stochastic Time Series Forecasting | Jun 5, 2024 | Audio Generation, Time Series | Code Available | 0 | 5 |
| Audio inpainting of music by means of neural networks | Oct 29, 2018 | Audio Generation, Audio inpainting | Code Available | 0 | 5 |
| MuseCoco: Generating Symbolic Music from Text | May 31, 2023 | Attribute, Audio Generation | Code Available | 0 | 5 |
| AudioGenX: Explainability on Text-to-Audio Generative Models | Feb 1, 2025 | Audio Generation, counterfactual | Code Available | 0 | 5 |
| Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders | Apr 12, 2019 | Attribute, Audio Generation | Code Available | 0 | 5 |
| SampleRNN: An Unconditional End-to-End Neural Audio Generation Model | Dec 22, 2016 | Audio Generation, Speech Synthesis | Code Available | 0 | 5 |
| XMAD-Bench: Cross-Domain Multilingual Audio Deepfake Benchmark | May 31, 2025 | Audio Generation, Face Swapping | Code Available | 0 | 5 |
| GANSynth: Adversarial Neural Audio Synthesis | Feb 23, 2019 | Audio Generation, Audio Synthesis | Code Available | 0 | 5 |
| Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation | Sep 14, 2024 | Audio Generation, Style Transfer | Unverified | 0 | 0 |
| Text-to-Audio Generation Synchronized with Videos | Mar 8, 2024 | AudioCaps, Audio Generation | Unverified | 0 | 0 |
| The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge | Oct 31, 2024 | Audio Generation, Language Modeling | Unverified | 0 | 0 |
| The Rarity of Musical Audio Signals Within the Space of Possible Audio Generation | May 23, 2024 | Audio Generation | Unverified | 0 | 0 |
| tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models | Nov 24, 2023 | Audio Generation, Event Detection | Unverified | 0 | 0 |
| Towards efficient quantum algorithms for diffusion probability models | Feb 20, 2025 | Audio Generation | Unverified | 0 | 0 |
| Transferring neural speech waveform synthesizers to musical instrument sounds generation | Oct 27, 2019 | Audio Generation, Audio Synthesis | Unverified | 0 | 0 |
| Tri-Ergon: Fine-grained Video-to-Audio Generation with Multi-modal Conditions and LUFS Control | Dec 29, 2024 | Audio Generation, Audio Synthesis | Unverified | 0 | 0 |
| Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio | May 19, 2025 | Audio Generation, Information Retrieval | Unverified | 0 | 0 |
| UniForm: A Unified Multi-Task Diffusion Transformer for Audio-Video Generation | Feb 6, 2025 | Audio Generation, Diversity | Unverified | 0 | 0 |
| (Un)paired signal-to-signal translation with 1D conditional GANs | Mar 5, 2024 | Audio Generation, Generative Adversarial Network | Unverified | 0 | 0 |
| Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound | Aug 21, 2024 | Audio Generation, Audio Synthesis | Unverified | 0 | 0 |
| Video-Guided Foley Sound Generation with Multimodal Controls | Nov 26, 2024 | Audio Generation | Unverified | 0 | 0 |
| Video-to-Audio Generation with Fine-grained Temporal Semantics | Sep 23, 2024 | Audio Generation, Video Generation | Unverified | 0 | 0 |
| Video-to-Audio Generation with Hidden Alignment | Jul 10, 2024 | Audio Generation, Data Augmentation | Unverified | 0 | 0 |
| VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation | Dec 14, 2024 | Audio Generation | Unverified | 0 | 0 |
| ViSAGe: Video-to-Spatial Audio Generation | Jun 13, 2025 | Audio Generation | Unverified | 0 | 0 |
| Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation | May 23, 2024 | Audio Generation, Denoising | Unverified | 0 | 0 |
| Visually Informed Binaural Audio Generation without Binaural Audios | Apr 13, 2021 | Audio Generation | Unverified | 0 | 0 |
| Voice command generation using Progressive Wavegans | Mar 13, 2019 | Audio Generation, Image Generation | Unverified | 0 | 0 |
| VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis | Dec 26, 2024 | Audio Generation, Speech Synthesis | Unverified | 0 | 0 |
| Wasserstein Convergence of Score-based Generative Models under Semiconvexity and Discontinuous Gradients | May 6, 2025 | Audio Generation, Denoising | Unverified | 0 | 0 |
| YingSound: Video-Guided Sound Effects Generation with Multi-modal Chain-of-Thought Controls | Dec 12, 2024 | Audio Generation | Unverified | 0 | 0 |
| Multimodal Cinematic Video Synthesis Using Text-to-Image and Audio Generation Models | Apr 6, 2025 | Audio Generation, GPU | Unverified | 0 | 0 |
| A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions | Nov 13, 2020 | Audio Generation, Music Generation | Unverified | 0 | 0 |
| Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos | Jun 13, 2024 | Audio Generation, Retrieval-augmented Generation | Unverified | 0 | 0 |