SOTAVerified

Caption Generation

Papers

Showing 101–110 of 310 papers

Title | Status | Hype
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response | Code | 1
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning | Code | 1
Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning | Code | 2
ViCo: Engaging Video Comment Generation with Human Preference Rewards | | 0
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions | Code | 2
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning | Code | 1
FigCaps-HF: A Figure-to-Caption Generative Framework and Benchmark with Human Feedback | Code | 0
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention and Text Attributes | | 0
Multi-Similarity Contrastive Learning | | 0
Knowledge Distillation for Efficient Audio-Visual Video Captioning | | 0

No leaderboard results yet.