M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP Mar 28, 2025 Audio captioning Audio Classification
Code Available 05 DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning Oct 12, 2024 Audio captioning Large Language Model
Code Available 05 Multi-task Regularization Based on Infrequent Classes for Audio Captioning Jul 9, 2020 Audio captioning Decoder
Code Available 05 OpenSep: Leveraging Large Language Models with Textual Inversion for Open World Audio Separation Sep 28, 2024 Audio captioning
Code Available 05 AUTOMATED AUDIO CAPTIONING BY FINE-TUNING BART WITH AUDIOSET TAGS Nov 15, 2021 AudioCaps Audio captioning
Code Available 05 SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs Oct 12, 2024 AudioCaps Audio captioning
Code Available 05 Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context Mar 19, 2025 Audio captioning Audio Question Answering
Code Available 05 Temporal Sub-sampling of Audio Feature Sequences for Automated Audio Captioning Jul 6, 2020 Audio captioning
Code Available 05 Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement Aug 23, 2023 Audio captioning Disentanglement
Code Available 05 An Eye for an Ear: Zero-shot Audio Description Leveraging an Image Captioner using Audiovisual Distribution Alignment Oct 8, 2024 Audio captioning Contrastive Learning
Code Available 05 Weakly-supervised Automated Audio Captioning via text only training Sep 21, 2023 AudioCaps Audio captioning
Code Available 05 Audio Caption in a Car Setting with a Sentence-Level Loss May 31, 2019 Audio captioning Decoder
Code Available 05 Language-based Audio Retrieval Task in DCASE 2022 Challenge Sep 20, 2022 Audio captioning Retrieval
— Unverified 00 AC/DC: LLM-based Audio Comprehension via Dialogue Continuation Jun 12, 2025 AudioCaps Audio captioning
— Unverified 00 Auto-ACD: A Large-scale Dataset for Audio-Language Representation Learning Sep 20, 2023 Audio captioning Caption Generation
— Unverified 00 An Attempt towards Interpretable Audio-Visual Video Captioning Dec 7, 2018 Audio captioning Audio-Visual Video Captioning
— Unverified 00 An investigation on selecting audio pre-trained models for audio captioning Aug 12, 2022 Audio captioning
— Unverified 00 A Transformer-based Audio Captioning Model with Keyword Estimation Jul 1, 2020 Acoustic Scene Classification Audio captioning
— Unverified 00 AudioCaps: Generating Captions for Audios in The Wild Jun 1, 2019 AudioCaps Audio captioning
— Unverified 00 Audio Captioning using Gated Recurrent Units Jun 5, 2020 Audio captioning
— Unverified 00 Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval Dec 14, 2020 Audio captioning Language Modeling
— Unverified 00 Enhancing Retrieval-Augmented Audio Captioning with Generation-Assisted Multimodal Querying and Progressive Learning Oct 14, 2024 AudioCaps Audio captioning
— Unverified 00 Audio Captioning with Composition of Acoustic and Semantic Information May 13, 2021 AudioCaps Audio captioning
— Unverified 00 Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model Jan 13, 2025 Audio captioning Instruction Following
— Unverified 00 Audio Dialogues: Dialogues dataset for audio and music understanding Apr 11, 2024 Audio captioning Audio Question Answering
— Unverified 00 Audio Difference Learning for Audio Captioning Sep 15, 2023 Audio captioning
— Unverified 00 Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities Mar 6, 2025 Audio captioning Language Modeling
— Unverified 00 Automated Audio Captioning: An Overview of Recent Progress and New Challenges May 12, 2022 Audio captioning Caption Generation
— Unverified 00 Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization Aug 10, 2021 Audio captioning Decoder
— Unverified 00 Automated Audio Captioning via Fusion of Low- and High- Dimensional Features Oct 10, 2022 AudioCaps Audio captioning
— Unverified 00 Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning Jun 4, 2022 Audio captioning
— Unverified 00 Automated Audio Captioning with Recurrent Neural Networks Jun 30, 2017 Audio captioning Decoder
— Unverified 00 Automatic Audio Captioning using Attention weighted Event based Embeddings Jan 28, 2022 Audio captioning Decoder
— Unverified 00 CLAP-ART: Automated Audio Captioning with Semantic-rich Audio Representation Tokenizer Jun 1, 2025 Audio captioning Language Modeling
— Unverified 00 Classifier-Guided Captioning Across Modalities Jan 3, 2025 Audio captioning Video Captioning
— Unverified 00 CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions Jan 28, 2025 Audio captioning Audio Generation
— Unverified 00 Diverse Audio Captioning via Adversarial Training Oct 13, 2021 Audio captioning Diversity
— Unverified 00 Diversity and bias in audio captioning datasets Nov 15, 2022 Audio captioning Diversity
— Unverified 00 Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning May 30, 2023 Audio captioning Decoder
— Unverified 00 Effects of Word-frequency based Pre- and Post- Processings for Audio Captioning Sep 24, 2020 Audio captioning Data Augmentation
— Unverified 00 Efficient Audio Captioning Transformer with Patchout and Text Guidance Apr 6, 2023 Audio captioning Caption Generation
— Unverified 00 EmotionCaps: Enhancing Audio Captioning Through Emotion-Augmented Data Generation Oct 15, 2024 Audio captioning Emotion Recognition
— Unverified 00 Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models Sep 17, 2024 Audio captioning Instruction Following
— Unverified 00 Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization Oct 9, 2024 Audio captioning Large Language Model
— Unverified 00 Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders Feb 21, 2025 Audio captioning Automatic Speech Recognition
— Unverified 00 Enhancing Temporal Understanding in Audio Question Answering for Large Audio Language Models Sep 10, 2024 Audio captioning Audio Question Answering
— Unverified 00 Evaluating Off-the-Shelf Machine Listening and Natural Language Models for Automated Audio Captioning Oct 14, 2021 Audio captioning Word Embeddings
— Unverified 00 Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning Sep 2, 2024 Audio captioning Reranking
— Unverified 00 Generating Realistic Images from In-the-wild Sounds Sep 5, 2023 Audio captioning Sentence
— Unverified 00 Impact of visual assistance for automated audio captioning Nov 18, 2022 Audio captioning Event Detection
— Unverified 00