| MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners | Jun 23, 2025 | Attribute, Audio inpainting | Unverified | 0 |
| Diff-TONE: Timestep Optimization for iNstrument Editing in Text-to-Music Diffusion Models | Jun 18, 2025 | Music Generation, Text-to-Music Generation | Unverified | 0 |
| Auto-Regressive vs Flow-Matching: a Comparative Study of Modeling Paradigms for Text-to-Music Generation | Jun 10, 2025 | Audio inpainting, Music Generation | Unverified | 0 |
| TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument | Feb 13, 2025 | Audio Generation, Decoder | Code Available | 2 |
| Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning | Jan 24, 2025 | FAD, Language Modeling | Unverified | 0 |
| ETTA: Elucidating the Design Space of Text-to-Audio Models | Dec 26, 2024 | AudioCaps, Audio captioning | Code Available | 2 |
| Long-Form Text-to-Music Generation with Adaptive Prompts: A Case Study in Tabletop Role-Playing Games Soundtracks | Nov 6, 2024 | Form, Music Generation | Code Available | 0 |
| MusicFlow: Cascaded Flow Matching for Text Guided Music Generation | Oct 27, 2024 | Music Generation, Text-to-Music Generation | Unverified | 0 |
| Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer | Oct 7, 2024 | Music Generation, Music Style Transfer | Unverified | 0 |
| Melody-Guided Music Generation | Sep 30, 2024 | cross-modal alignment, Music Generation | Code Available | 2 |
| FLUX that Plays Music | Sep 1, 2024 | Music Generation, Text-to-Music Generation | Code Available | 13 |
| Combining audio control and style transfer using latent diffusion | Jul 31, 2024 | Disentanglement, Music Generation | Unverified | 0 |
| MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation | Jul 21, 2024 | Diversity, Music Generation | Code Available | 2 |
| Stable Audio Open | Jul 19, 2024 | Audio Generation, Text-to-Music Generation | Code Available | 7 |
| The Interpretation Gap in Text-to-Music Generation Models | Jul 14, 2024 | Information Retrieval, Music Generation | Unverified | 0 |
| Improving Text-To-Audio Models with Synthetic Captions | Jun 18, 2024 | AudioCaps, Audio captioning | Code Available | 5 |
| JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning | Jun 18, 2024 | Music Generation, Text-to-Music Generation | Unverified | 0 |
| MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models | Jun 7, 2024 | FAD, Text-to-Music Generation | Code Available | 2 |
| Quality-aware Masked Diffusion Transformer for Enhanced Music Generation | May 24, 2024 | Diversity, Music Generation | Code Available | 4 |
| MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models | Feb 9, 2024 | Music Generation, Text-to-Music Generation | Code Available | 1 |
| Fast Timing-Conditioned Latent Audio Diffusion | Feb 7, 2024 | Audio Generation, GPU | Code Available | 7 |
| PAM: Prompting Audio-Language Models for Audio Quality Assessment | Feb 1, 2024 | Audio Quality Assessment, Music Generation | Code Available | 2 |
| The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation | Nov 16, 2023 | Music Captioning, Music Generation | Code Available | 1 |
| Mustango: Toward Controllable Text-to-Music Generation | Nov 14, 2023 | Data Augmentation, Denoising | Code Available | 2 |
| Music ControlNet: A model similar to SD ControlNetD that can accurately control music generation | Nov 7, 2023 | Music Generation, Rhythm | Code Available | 1 |
| Investigating Personalization Methods in Text to Music Generation | Sep 20, 2023 | Data Augmentation, Music Generation | Code Available | 1 |
| Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning | Aug 22, 2023 | Caption Generation, Large Language Model | Code Available | 2 |
| AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Aug 10, 2023 | Audio Generation, In-Context Learning | Code Available | 4 |
| JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models | Aug 9, 2023 | Computational Efficiency, In-Context Learning | Code Available | 1 |
| MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies | Aug 3, 2023 | Audio Generation, Beat Tracking | Code Available | 1 |
| Simple and Controllable Music Generation | Jun 8, 2023 | Language Modeling, Language Modelling | Code Available | 6 |
| Efficient Neural Music Generation | May 25, 2023 | Denoising, Music Generation | Unverified | 0 |
| ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models | Feb 9, 2023 | Diversity, Music Generation | Unverified | 0 |
| Noise2Music: Text-conditioned Music Generation with Diffusion Models | Feb 8, 2023 | Music Generation, Text-to-Music Generation | Unverified | 0 |
| Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Jan 27, 2023 | GPU, Image Generation | Code Available | 4 |
| MusicLM: Generating Music From Text | Jan 26, 2023 | Music Generation, Text-to-Music Generation | Code Available | 6 |
| Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task | Nov 21, 2022 | Music Generation, Text-to-Music Generation | Code Available | 1 |