| ETTA: Elucidating the Design Space of Text-to-Audio Models | Dec 26, 2024 | AudioCapsAudio captioning | CodeCode Available | 2 |
| MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models | Jun 7, 2024 | FADText-to-Music Generation | CodeCode Available | 2 |
| Melody-Guided Music Generation | Sep 30, 2024 | cross-modal alignmentMusic Generation | CodeCode Available | 2 |
| MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation | Jul 21, 2024 | DiversityMusic Generation | CodeCode Available | 2 |
| Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning | Aug 22, 2023 | Caption GenerationLarge Language Model | CodeCode Available | 2 |
| Mustango: Toward Controllable Text-to-Music Generation | Nov 14, 2023 | Data AugmentationDenoising | CodeCode Available | 2 |
| PAM: Prompting Audio-Language Models for Audio Quality Assessment | Feb 1, 2024 | Audio Quality AssessmentMusic Generation | CodeCode Available | 2 |
| MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models | Feb 9, 2024 | Music GenerationText-to-Music Generation | CodeCode Available | 1 |
| MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies | Aug 3, 2023 | Audio GenerationBeat Tracking | CodeCode Available | 1 |
| Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task | Nov 21, 2022 | Music GenerationText-to-Music Generation | CodeCode Available | 1 |