SOTAVerified

AudioCaps

Papers

Showing 11-20 of 64 papers

| Title | Status | Hype |
| --- | --- | --- |
| ADIFF: Explaining audio difference using natural language | Code | 1 |
| LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport | Code | 1 |
| Language-based Audio Retrieval with Co-Attention Networks | | 0 |
| ETTA: Elucidating the Design Space of Text-to-Audio Models | Code | 2 |
| Enhancing Retrieval-Augmented Audio Captioning with Generation-Assisted Multimodal Querying and Progressive Learning | | 0 |
| SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs | Code | 0 |
| DiffATR: Diffusion-based Generative Modeling for Audio-Text Retrieval | | 0 |
| EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance | Code | 2 |
| Dissecting Temporal Understanding in Text-to-Audio Retrieval | | 0 |
| Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval | Code | 0 |

No leaderboard results yet.