SOTAVerified

Caption Generation

Papers

Showing 126–150 of 310 papers

| Title | Status | Hype |
| --- | --- | --- |
| GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning | | 0 |
| LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images | | 0 |
| Automated Audio Captioning: An Overview of Recent Progress and New Challenges | | 0 |
| Knowledge driven Description Synthesis for Floor Plan Interpretation | | 0 |
| Efficient Audio Captioning Transformer with Patchout and Text Guidance | | 0 |
| EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits | | 0 |
| Common Subspace for Model and Similarity: Phrase Learning for Caption Generation From Images | | 0 |
| Language Production Dynamics with Recurrent Neural Networks | | 0 |
| LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation | | 0 |
| Clue: Cross-modal Coherence Modeling for Caption Generation | | 0 |
| DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration | | 0 |
| Domain Adaptation for Neural Networks by Parameter Augmentation | | 0 |
| Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SCICAP Challenge 2023 | | 0 |
| Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? | | 0 |
| Image Captioning using Facial Expression and Attention | | 0 |
| Attention-based transformer models for image captioning across languages: An in-depth survey and evaluation | | 0 |
| Image Caption Generation Framework for Assamese News using Attention Mechanism | | 0 |
| Auto-ACD: A Large-scale Dataset for Audio-Language Representation Learning | | 0 |
| Image Caption Generation for Low-Resource Assamese Language | | 0 |
| IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers | | 0 |
| Chittron: An Automatic Bangla Image Captioning System | | 0 |
| Image to Bengali Caption Generation Using Deep CNN and Bidirectional Gated Recurrent Unit | | 0 |
| Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space | | 0 |
| Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding | | 0 |
| Improving Image Captioning with Better Use of Caption | | 0 |
Page 6 of 13

No leaderboard results yet.