| MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis | Dec 19, 2024 | Audio Generation, Audio Synthesis | Code Available | 7 |
| Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching | Jun 1, 2024 | Audio Generation, Video-to-Sound Generation | Code Available | 2 |
| Tell What You Hear From What You See -- Video to Audio Generation Through Text | Nov 8, 2024 | Audio captioning, Audio Generation | Code Available | 1 |
| Temporally Aligned Audio for Video with Autoregression | Sep 20, 2024 | Audio Generation, Video-to-Sound Generation | Code Available | 1 |
| Read, Watch and Scream! Sound Generation from Text and Video | Jul 8, 2024 | Audio Generation, Triplet | Code Available | 1 |
| V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models | Aug 18, 2023 | Audio Generation, Video-to-Sound Generation | Code Available | 1 |
| Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound | Aug 21, 2024 | Audio Generation, Audio Synthesis | - (Unverified) | 0 |
| Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity | Jul 15, 2024 | Video-to-Sound Generation | - (Unverified) | 0 |
| VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement | Nov 19, 2022 | Disentanglement, Video-to-Sound Generation | - (Unverified) | 0 |