| Analysis of Convolutional Decoder for Image Caption Generation | Mar 8, 2021 | Caption Generation, Data Augmentation | Unverified | 0 | 0 |
| An encoder-decoder based framework for hindi image caption generation | Jul 9, 2021 | Caption Generation, Decoder | Unverified | 0 | 0 |
| End-to-End Video Captioning | Apr 4, 2019 | Action Recognition, Caption Generation | Unverified | 0 | 0 |
| A Thorough Review on Recent Deep Learning Methodologies for Image Captioning | Jul 28, 2021 | Caption Generation, Descriptive | Unverified | 0 | 0 |
| Attention-based transformer models for image captioning across languages: An in-depth survey and evaluation | Jun 3, 2025 | Caption Generation, Image Captioning | Unverified | 0 | 0 |
| Automated Audio Captioning: An Overview of Recent Progress and New Challenges | May 12, 2022 | Audio captioning, Caption Generation | Unverified | 0 | 0 |
| Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains | Nov 22, 2024 | Benchmarking, Caption Generation | Unverified | 0 | 0 |
| BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving | Jan 2, 2024 | Autonomous Driving, Caption Generation | Unverified | 0 | 0 |
| Bi-directional Contextual Attention for 3D Dense Captioning | Aug 13, 2024 | 3D dense captioning, Attribute | Unverified | 0 | 0 |
| VidCoM: Fast Video Comprehension through Large Language Models with Multimodal Tools | Oct 16, 2023 | Caption Generation, Descriptive | Unverified | 0 | 0 |
| RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment | May 31, 2023 | Caption Generation, Language Modelling | Unverified | 0 | 0 |
| Bringing back simplicity and lightliness into neural image captioning | Oct 15, 2018 | Caption Generation, Image Captioning | Unverified | 0 | 0 |
| CapText: Large Language Model-based Caption Generation From Image Context and Description | Jun 1, 2023 | Caption Generation, Image to text | Unverified | 0 | 0 |
| Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments | Mar 23, 2020 | Caption Generation, Chunking | Unverified | 0 | 0 |
| Chittron: An Automatic Bangla Image Captioning System | Sep 2, 2018 | Caption Generation, Image Captioning | Unverified | 0 | 0 |
| Clue: Cross-modal Coherence Modeling for Caption Generation | May 2, 2020 | Caption Generation, controllable image captioning | Unverified | 0 | 0 |
| Common Subspace for Model and Similarity: Phrase Learning for Caption Generation From Images | Dec 1, 2015 | Caption Generation, Descriptive | Unverified | 0 | 0 |
| Controlled Caption Generation for Images Through Adversarial Attacks | Jul 7, 2021 | Caption Generation, Image Captioning | Unverified | 0 | 0 |
| Cortico-cerebellar networks as decoupled neural interfaces | Jan 1, 2021 | Caption Generation | Unverified | 0 | 0 |
| CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving | Aug 19, 2024 | Autonomous Driving, Caption Generation | Unverified | 0 | 0 |
| Cross-Lingual Image Caption Generation | Aug 1, 2016 | Caption Generation, Dependency Parsing | Unverified | 0 | 0 |
| Cross-modal Coherence Modeling for Caption Generation | Jul 1, 2020 | Caption Generation, controllable image captioning | Unverified | 0 | 0 |
| D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | Dec 2, 2021 | 3D dense captioning, 3D visual grounding | Unverified | 0 | 0 |
| DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism | Nov 25, 2023 | Caption Generation, Denoising | Unverified | 0 | 0 |
| Deep Bayesian Natural Language Processing | Jul 1, 2019 | Caption Generation, Clustering | Unverified | 0 | 0 |
| Deep Learning Approaches on Image Captioning: A Review | Jan 31, 2022 | Caption Generation, Deep Learning | Unverified | 0 | 0 |
| Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models | Nov 18, 2019 | Anomaly Detection, Autonomous Driving | Unverified | 0 | 0 |
| Denoising Large-Scale Image Captioning from Alt-text Data using Content Selection Models | Sep 17, 2021 | Caption Generation, Denoising | Unverified | 0 | 0 |
| Dense Video Captioning: A Survey of Techniques, Datasets and Evaluation Protocols | Nov 5, 2023 | Caption Generation, Dense Video Captioning | Unverified | 0 | 0 |
| Describing Multimedia Content using Attention-based Encoder-Decoder Networks | Jul 4, 2015 | Caption Generation, Decoder | Unverified | 0 | 0 |
| Describing Natural Images Containing Novel Objects with Knowledge Guided Assitance | Oct 17, 2017 | Caption Generation | Unverified | 0 | 0 |
| Caption Generation on Scenes with Seen and Unseen Object Categories | Aug 13, 2021 | Caption Generation, Language Modelling | Unverified | 0 | 0 |
| DiffCap: Exploring Continuous Diffusion on Image Captioning | May 20, 2023 | Caption Generation, Diversity | Unverified | 0 | 0 |
| DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding | Dec 2, 2024 | Caption Generation, Domain Generalization | Unverified | 0 | 0 |
| Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space | Nov 19, 2017 | Caption Generation, Image Description | Unverified | 0 | 0 |
| Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? | Jun 20, 2024 | Caption Generation, Hallucination | Unverified | 0 | 0 |
| Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SCICAP Challenge 2023 | Jan 31, 2025 | Articles, Caption Generation | Unverified | 0 | 0 |
| Domain Adaptation for Neural Networks by Parameter Augmentation | Jul 1, 2016 | Caption Generation, Domain Adaptation | Unverified | 0 | 0 |
| DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration | Jun 1, 2024 | Caption Generation, Image Captioning | Unverified | 0 | 0 |
| EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits | Jun 11, 2025 | Artifact Detection, Caption Generation | Unverified | 0 | 0 |
| Efficient Audio Captioning Transformer with Patchout and Text Guidance | Apr 6, 2023 | Audio captioning, Caption Generation | Unverified | 0 | 0 |
| E-MMAD: Multimodal Advertising Caption Generation Based on Structured Information | Nov 16, 2021 | Caption Generation, valid | Unverified | 0 | 0 |
| Empirical Analysis of Image Caption Generation using Deep Learning | May 14, 2021 | Caption Generation, Decoder | Unverified | 0 | 0 |
| End to End Recognition System for Recognizing Offline Unconstrained Vietnamese Handwriting | May 14, 2019 | Caption Generation, Decoder | Unverified | 0 | 0 |
| Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality Learning | Feb 19, 2025 | Caption Generation, Classification | Unverified | 0 | 0 |
| Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback | Mar 11, 2024 | Caption Generation, reinforcement-learning | Unverified | 0 | 0 |
| Enhancing Image Captioning with Neural Models | Dec 1, 2023 | Caption Generation, Image Captioning | Unverified | 0 | 0 |
| Entity-aware Image Caption Generation | Apr 21, 2018 | Caption Generation, Image Captioning | Unverified | 0 | 0 |
| Error Causal inference for Multi-Fusion models | Jun 1, 2021 | Caption Generation, Causal Inference | Unverified | 0 | 0 |
| Evaluation of Automatic Video Captioning Using Direct Assessment | Oct 29, 2017 | Caption Generation, Machine Translation | Unverified | 0 | 0 |