SOTAVerified

Caption Generation

Papers

Showing 26–50 of 310 papers

Title | Status | Hype
Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation | Code | 1
Improving Image Captioning with Better Use of Captions | Code | 1
Large-scale Pre-training for Grounded Video Caption Generation | Code | 1
Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds | Code | 1
SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning | Code | 1
NeuSyRE: Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment | Code | 1
HCQA @ Ego4D EgoSchema Challenge 2024 | Code | 1
GL-RG: Global-Local Representation Granularity for Video Captioning | Code | 1
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks | Code | 1
Human-like Controllable Image Captioning with Verb-specific Semantic Roles | Code | 1
End-to-End Dense Video Captioning with Parallel Decoding | Code | 1
Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning | Code | 1
COSMic: A Coherence-Aware Generation Metric for Image Descriptions | Code | 1
Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension | Code | 1
Injecting Semantic Concepts into End-to-End Image Captioning | Code | 1
EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning | Code | 1
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts | Code | 1
Connecting What to Say With Where to Look by Modeling Human Attention Traces | Code | 1
Deep Reinforcement Learning For Sequence to Sequence Models | Code | 1
Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation | Code | 1
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations | Code | 1
BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes | Code | 1
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network | Code | 1
Belief Revision based Caption Re-ranker with Visual Semantic Information | Code | 1
Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object Localization | Code | 1

No leaderboard results yet.