SOTAVerified

Caption Generation

Papers

Showing 201–250 of 310 papers

Title | Status | Hype
Analysis of Convolutional Decoder for Image Caption Generation | | 0
An encoder-decoder based framework for hindi image caption generation | | 0
End-to-End Video Captioning | | 0
A Thorough Review on Recent Deep Learning Methodologies for Image Captioning | | 0
Attention-based transformer models for image captioning across languages: An in-depth survey and evaluation | | 0
Automated Audio Captioning: An Overview of Recent Progress and New Challenges | | 0
Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains | | 0
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving | | 0
Bi-directional Contextual Attention for 3D Dense Captioning | | 0
VidCoM: Fast Video Comprehension through Large Language Models with Multimodal Tools | | 0
RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment | | 0
Bringing back simplicity and lightliness into neural image captioning | | 0
CapText: Large Language Model-based Caption Generation From Image Context and Description | | 0
Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments | | 0
Chittron: An Automatic Bangla Image Captioning System | | 0
Clue: Cross-modal Coherence Modeling for Caption Generation | | 0
Common Subspace for Model and Similarity: Phrase Learning for Caption Generation From Images | | 0
Controlled Caption Generation for Images Through Adversarial Attacks | | 0
Cortico-cerebellar networks as decoupled neural interfaces | | 0
CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving | | 0
Cross-Lingual Image Caption Generation | | 0
Cross-modal Coherence Modeling for Caption Generation | | 0
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | | 0
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism | | 0
Deep Bayesian Natural Language Processing | | 0
Deep Learning Approaches on Image Captioning: A Review | | 0
Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models | | 0
Denoising Large-Scale Image Captioning from Alt-text Data using Content Selection Models | | 0
Dense Video Captioning: A Survey of Techniques, Datasets and Evaluation Protocols | | 0
Describing Multimedia Content using Attention-based Encoder--Decoder Networks | | 0
Describing Natural Images Containing Novel Objects with Knowledge Guided Assitance | | 0
Caption Generation on Scenes with Seen and Unseen Object Categories | | 0
DiffCap: Exploring Continuous Diffusion on Image Captioning | | 0
DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding | | 0
Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space | | 0
Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? | | 0
Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SCICAP Challenge 2023 | | 0
Domain Adaptation for Neural Networks by Parameter Augmentation | | 0
DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration | | 0
EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits | | 0
Efficient Audio Captioning Transformer with Patchout and Text Guidance | | 0
E-MMAD: Multimodal Advertising Caption Generation Based on Structured Information | | 0
Empirical Analysis of Image Caption Generation using Deep Learning | | 0
End to End Recognition System for Recognizing Offline Unconstrained Vietnamese Handwriting | | 0
Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality Learning | | 0
Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback | | 0
Enhancing Image Captioning with Neural Models | | 0
Entity-aware Image Caption Generation | | 0
Error Causal inference for Multi-Fusion models | | 0
Evaluation of Automatic Video Captioning Using Direct Assessment | | 0

No leaderboard results yet.