| Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning | Feb 4, 2023 | Caption Generation, Coherence Evaluation | Code Available | 0 |
| Uncertainty-Aware Image Captioning | Nov 30, 2022 | Caption Generation, Image Captioning | Unverified | 0 |
| Retrieval-Augmented Multimodal Language Modeling | Nov 22, 2022 | Caption Generation, Image Captioning | Unverified | 0 |
| Event and Entity Extraction from Generated Video Captions | Nov 5, 2022 | Caption Generation, Dense Video Captioning | Code Available | 0 |
| Image Caption Generation for Low-Resource Assamese Language | Nov 1, 2022 | Caption Generation, Decoder | Unverified | 0 |
| Generating image captions with external encyclopedic knowledge | Oct 10, 2022 | Caption Generation, Image Captioning | Unverified | 0 |
| REST: REtrieve & Self-Train for generative action recognition | Sep 29, 2022 | Action Recognition, Caption Generation | Unverified | 0 |
| Medical Image Captioning via Generative Pretrained Transformers | Sep 28, 2022 | Caption Generation, Descriptive | Unverified | 0 |
| Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned | Sep 26, 2022 | Caption Generation, Semantic Similarity | Unverified | 0 |
| Multilingual Image Corpus – Towards a Multimodal and Multilingual Dataset | Jun 1, 2022 | Caption Generation, image-classification | Unverified | 0 |
| Aligning Images and Text with Semantic Role Labels for Fine-Grained Cross-Modal Understanding | Jun 1, 2022 | Caption Generation, Image Retrieval | Unverified | 0 |
| Examining the Effects of Language-and-Vision Data Augmentation for Generation of Descriptions of Human Faces | Jun 1, 2022 | Caption Generation, Data Augmentation | Unverified | 0 |
| Automated Audio Captioning: An Overview of Recent Progress and New Challenges | May 12, 2022 | Audio captioning, Caption Generation | Unverified | 0 |
| Guiding Attention using Partial-Order Relationships for Image Captioning | Apr 15, 2022 | Caption Generation, Image Captioning | Unverified | 0 |
| NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models | Mar 29, 2022 | Caption Generation | Code Available | 0 |
| NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge | Mar 28, 2022 | Caption Generation, Object | Unverified | 0 |
| A Deep Neural Framework for Image Caption Generation Using GRU-Based Attention Mechanism | Mar 3, 2022 | Caption Generation, Decoder | Unverified | 0 |
| Deep Learning Approaches on Image Captioning: A Review | Jan 31, 2022 | Caption Generation, Deep Learning | Unverified | 0 |
| Local Information Assisted Attention-free Decoder for Audio Captioning | Jan 10, 2022 | Audio captioning, Caption Generation | Code Available | 0 |
| MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-based Image Captioning | Dec 13, 2021 | Caption Generation, Descriptive | Unverified | 0 |
| D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | Dec 2, 2021 | 3D dense captioning, 3D visual grounding | Unverified | 0 |
| Image Caption Generation Framework for Assamese News using Attention Mechanism | Dec 1, 2021 | Caption Generation, Decoder | Unverified | 0 |
| Multi-modal Dependency Tree for Video Captioning | Dec 1, 2021 | Caption Generation, Dependency Parsing | Unverified | 0 |
| CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter | Nov 30, 2021 | Caption Generation, Representation Learning | Code Available | 0 |
| E-MMAD: Multimodal Advertising Caption Generation Based on Structured Information | Nov 16, 2021 | Caption Generation, valid | Unverified | 0 |
| Temporal Knowledge-Aware Image Captioning | Nov 16, 2021 | Caption Generation, Image Captioning | Unverified | 0 |
| AUTOMATED AUDIO CAPTIONING BY FINE-TUNING BART WITH AUDIOSET TAGS | Nov 15, 2021 | AudioCaps, Audio captioning | Code Available | 0 |
| R^3Net:Relation-embedded Representation Reconstruction Network for Change Captioning | Nov 1, 2021 | Caption Generation, Relation | Code Available | 0 |
| Bangla Image Caption Generation through CNN-Transformer based Encoder-Decoder Network | Oct 24, 2021 | Caption Generation, Decoder | Code Available | 0 |
| Cortico-cerebellar networks as decoupling neural interfaces | Oct 21, 2021 | Caption Generation | Code Available | 0 |
| R^3Net:Relation-embedded Representation Reconstruction Network for Change Captioning | Oct 20, 2021 | Caption Generation, Relation | Code Available | 0 |
| Geometry-Entangled Visual Semantic Transformer for Image Captioning | Sep 29, 2021 | Caption Generation, Image Captioning | Unverified | 0 |
| Scene Graph Generation for Better Image Captioning? | Sep 23, 2021 | Caption Generation, Graph Generation | Unverified | 0 |
| Denoising Large-Scale Image Captioning from Alt-text Data using Content Selection Models | Sep 17, 2021 | Caption Generation, Denoising | Unverified | 0 |
| Journalistic Guidelines Aware News Image Captioning | Sep 7, 2021 | Caption Generation, Descriptive | Code Available | 0 |
| LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation | Sep 4, 2021 | Caption Generation, Image Captioning | Code Available | 0 |
| Goal-driven text descriptions for images | Aug 28, 2021 | AI Agent, Caption Generation | Unverified | 0 |
| Table Caption Generation in Scholarly Documents Leveraging Pre-trained Language Models | Aug 18, 2021 | Caption Generation | Code Available | 0 |
| Caption Generation on Scenes with Seen and Unseen Object Categories | Aug 13, 2021 | Caption Generation, Language Modelling | Unverified | 0 |
| O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning | Aug 5, 2021 | Attribute, Caption Generation | Unverified | 0 |
| NLPHut’s Participation at WAT2021 | Aug 1, 2021 | Caption Generation, Image Captioning | Unverified | 0 |
| A Thorough Review on Recent Deep Learning Methodologies for Image Captioning | Jul 28, 2021 | Caption Generation, Descriptive | Unverified | 0 |
| Global Object Proposals for Improving Multi-Sentence Video Descriptions | Jul 18, 2021 | Caption Generation, Dense Video Captioning | Code Available | 0 |
| An encoder-decoder based framework for hindi image caption generation | Jul 9, 2021 | Caption Generation, Decoder | Unverified | 0 |
| Controlled Caption Generation for Images Through Adversarial Attacks | Jul 7, 2021 | Caption Generation, Image Captioning | Unverified | 0 |
| THE DCASE 2021 CHALLENGE TASK 6 SYSTEM: AUTOMATED AUDIO CAPTIONING WITH WEAKLY SUPERVISED PRE-TRAING AND WORD SELECTION METHODS | Jul 6, 2021 | Audio captioning, Caption Generation | Unverified | 0 |
| Error Causal inference for Multi-Fusion models | Jun 1, 2021 | Caption Generation, Causal Inference | Unverified | 0 |
| Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching | May 18, 2021 | Caption Generation, Cross-Modal Retrieval | Unverified | 0 |
| Empirical Analysis of Image Caption Generation using Deep Learning | May 14, 2021 | Caption Generation, Decoder | Unverified | 0 |
| 3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model | Mar 20, 2021 | Caption Generation, Image Captioning | Unverified | 0 |