| VoiceLDM: Text-to-Speech with Environmental Context | Sep 24, 2023 | AudioCaps, text-to-speech | Unverified | 0 |
| Weakly-supervised Automated Audio Captioning via text only training | Sep 21, 2023 | AudioCaps, Audio captioning | Code Available | 0 |
| Retrieval-Augmented Text-to-Audio Generation | Sep 14, 2023 | AudioCaps, Audio Generation | Unverified | 0 |
| Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval? | Aug 29, 2023 | AudioCaps, Audio captioning | Unverified | 0 |
| Rethinking Transfer and Auxiliary Learning for Improving Audio Captioning Transformer | Aug 20, 2023 | AudioCaps, Audio captioning | Unverified | 0 |
| DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment | May 22, 2023 | AudioCaps, Audio Generation | Unverified | 0 |
| Accommodating Audio Modality in CLIP for Multimodal Processing | Mar 12, 2023 | AudioCaps, Contrastive Learning | Code Available | 0 |
| Automated Audio Captioning via Fusion of Low- and High- Dimensional Features | Oct 10, 2022 | AudioCaps, Audio captioning | Unverified | 0 |
| Audio-text Retrieval in Context | Mar 25, 2022 | AudioCaps, Retrieval | Unverified | 0 |
| Leveraging Pre-trained BERT for Audio Captioning | Mar 6, 2022 | AudioCaps, Audio captioning | Unverified | 0 |
| Joint Speech Recognition and Audio Captioning | Feb 3, 2022 | AudioCaps, Audio captioning | Unverified | 0 |
| Automated Audio Captioning by Fine-Tuning BART with AudioSet Tags | Nov 15, 2021 | AudioCaps, Audio captioning | Code Available | 0 |
| Audio Captioning with Composition of Acoustic and Semantic Information | May 13, 2021 | AudioCaps, Audio captioning | Unverified | 0 |
| AudioCaps: Generating Captions for Audios in The Wild | Jun 1, 2019 | AudioCaps, Audio captioning | Unverified | 0 |