| I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation | Mar 20, 2017 | Caption Generation, Data Augmentation | —Unverified | 0 | 0 |
| IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification | Mar 13, 2025 | Caption Generation | —Unverified | 0 | 0 |
| Identifying Multi-modal Knowledge Neurons in Pretrained Transformers via Two-stage Filtering | Mar 29, 2025 | Caption Generation, knowledge editing | —Unverified | 0 | 0 |
| IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers | Nov 27, 2023 | Caption Generation, Image-text Retrieval | —Unverified | 0 | 0 |
| Image Caption Generation for Low-Resource Assamese Language | Nov 1, 2022 | Caption Generation, Decoder | —Unverified | 0 | 0 |
| Image Caption Generation Framework for Assamese News using Attention Mechanism | Dec 1, 2021 | Caption Generation, Decoder | —Unverified | 0 | 0 |
| Image Captioning using Facial Expression and Attention | Aug 8, 2019 | Caption Generation, Image Captioning | —Unverified | 0 | 0 |
| Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding | Jun 16, 2019 | Caption Generation, Image Captioning | —Unverified | 0 | 0 |
| Image Captioning with Unseen Objects | Jul 31, 2019 | Caption Generation, Image Captioning | —Unverified | 0 | 0 |
| Image Position Prediction in Multimodal Documents | May 1, 2020 | Articles, Caption Generation | —Unverified | 0 | 0 |