SOTAVerified

AudioCaps

Papers

Showing 31-40 of 64 papers

Title | Status | Hype
Audiobox: Unified Audio Generation with Natural Language Prompts | | 0
Audio-Visual LLM for Video Understanding | | 0
FLAP: Fast Language-Audio Pre-training | | 0
Generation or Replication: Auscultating Audio Latent Diffusion Models | | 0
VoiceLDM: Text-to-Speech with Environmental Context | | 0
Weakly-supervised Automated Audio Captioning via text only training | Code | 0
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation | Code | 1
RECAP: Retrieval-Augmented Audio Captioning | Code | 1
Retrieval-Augmented Text-to-Audio Generation | | 0
Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval? | | 0

No leaderboard results yet.