| Multimodal Neural Machine Translation for Low-resource Language Pairs using Synthetic Data | Jul 1, 2018 | Image Description, Machine Translation | —Unverified | 0 | 0 |
| Weakly Supervised Learning of Objects, Attributes and their Associations | Mar 31, 2015 | Attribute, Image Description | —Unverified | 0 | 0 |
| Neural Dependency Coding inspired Multimodal Fusion | Sep 28, 2021 | Emotion Recognition, Image Description | —Unverified | 0 | 0 |
| Unsupervised Stylish Image Description Generation via Domain Layer Norm | Sep 11, 2018 | Image Description | —Unverified | 0 | 0 |
| On the Use of Deep Learning for Blind Image Quality Assessment | Feb 17, 2016 | Blind Image Quality Assessment, Image Description | —Unverified | 0 | 0 |
| On the use of human reference data for evaluating automatic image descriptions | Jun 15, 2020 | Image Description | —Unverified | 0 | 0 |
| Artwork Explanation in Large-scale Vision Language Models | Feb 29, 2024 | Explanation Generation, Image Description | —Unverified | 0 | 0 |
| ParaCNN: Visual Paragraph Generation via Adversarial Twin Contextual CNNs | Apr 21, 2020 | Image Captioning, Image Description | —Unverified | 0 | 0 |
| Personalizing Multimodal Large Language Models for Image Captioning: An Experimental Analysis | Dec 4, 2024 | Image Captioning, Image Description | —Unverified | 0 | 0 |
| phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning | Aug 20, 2016 | Image Captioning, Image Description | —Unverified | 0 | 0 |
| Phrase-based Image Captioning with Hierarchical LSTM Model | Nov 11, 2017 | Decoder, Image Captioning | —Unverified | 0 | 0 |
| Place recognition: An Overview of Vision Perspective | Jun 17, 2017 | Image Classification | —Unverified | 0 | 0 |
| Place recognition in gardens by learning visual representations: data set and benchmark analysis | Jun 28, 2019 | Camera Localization, Image Description | —Unverified | 0 | 0 |
| Pragmatic descriptions of perceptual stimuli | Apr 1, 2017 | Image Description, Object Recognition | —Unverified | 0 | 0 |
| A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models | Feb 28, 2024 | Image Description, Question Answering | —Unverified | 0 | 0 |
| A Preliminary Survey of Semantic Descriptive Model for Images | Jan 13, 2025 | Descriptive, Image Description | —Unverified | 0 | 0 |
| Recurrent Attention Unit | Oct 30, 2018 | General Classification, Handwriting Recognition | —Unverified | 0 | 0 |
| Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering | Dec 15, 2016 | Decoder, Image Captioning | —Unverified | 0 | 0 |
| Recurrent Topic-Transition GAN for Visual Paragraph Generation | Mar 21, 2017 | Generative Adversarial Network, Image Description | —Unverified | 0 | 0 |
| Annotation Methodologies for Vision and Language Dataset Creation | Jul 10, 2016 | Action Recognition, Image Description | —Unverified | 0 | 0 |
| WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization | May 28, 2024 | Domain Generalization, Image Description | —Unverified | 0 | 0 |
| VIFIDEL: Evaluating the Visual Fidelity of Image Descriptions | Jul 22, 2019 | Image Description, Semantic Similarity | —Unverified | 0 | 0 |
| SafeAccess+: An Intelligent System to make Smart Home Safer and Americans with Disability Act Compliant | Sep 14, 2021 | Image Description | —Unverified | 0 | 0 |
| Cross-validating Image Description Datasets and Evaluation Metrics | May 1, 2016 | Image Description, Sentence | —Unverified | 0 | 0 |
| Curriculum Learning for Multi-Task Classification of Visual Attributes | Aug 29, 2017 | Attribute, Classification | —Unverified | 0 | 0 |
| Curriculum Learning of Visual Attribute Clusters for Multi-Task Classification | Sep 19, 2017 | Attribute, Classification | —Unverified | 0 | 0 |
| Customized Image Narrative Generation via Interactive Visual Question Generation and Answering | Apr 27, 2018 | Diversity, Image Description | —Unverified | 0 | 0 |
| Cross Modification Attention Based Deliberation Model for Image Captioning | Sep 17, 2021 | Decoder, Descriptive | —Unverified | 0 | 0 |
| A Hierarchical Approach for Visual Storytelling Using Image Description | Sep 26, 2019 | Decoder, Image Description | —Unverified | 0 | 0 |
| Computer Vision and Conflicting Values: Describing People with Automated Alt Text | May 26, 2021 | Image Description | —Unverified | 0 | 0 |
| DIDEC: The Dutch Image Description and Eye-tracking Corpus | Aug 1, 2018 | Image Description, Specificity | —Unverified | 0 | 0 |
| DiffCap: Exploring Continuous Diffusion on Image Captioning | May 20, 2023 | Caption Generation, Diversity | —Unverified | 0 | 0 |
| Seeing the Unseen: Visual Common Sense for Semantic Placement | Jan 15, 2024 | Common Sense Reasoning, Image Description | —Unverified | 0 | 0 |
| Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space | Nov 19, 2017 | Caption Generation, Image Description | —Unverified | 0 | 0 |
| Sequential Attention GAN for Interactive Image Editing | Dec 20, 2018 | Image Description, Image Generation | —Unverified | 0 | 0 |
| Don't Mention the Shoe! A Learning to Rank Approach to Content Selection for Image Description Generation | Sep 1, 2016 | Image Description, Image Retrieval | —Unverified | 0 | 0 |
| Doubly-Attentive Decoder for Multi-modal Neural Machine Translation | Feb 4, 2017 | Decoder, Image Description | —Unverified | 0 | 0 |
| Draw and Tell: Multimodal Descriptions Outperform Verbal- or Sketch-Only Descriptions in an Image Retrieval Task | Nov 1, 2017 | Image Description, Image Retrieval | —Unverified | 0 | 0 |
| Simple Image Description Generator via a Linear Phrase-Based Approach | Dec 29, 2014 | Descriptive, Image Description | —Unverified | 0 | 0 |
| E-PUR: An Energy-Efficient Processing Unit for Recurrent Neural Networks | Nov 20, 2017 | Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| EPYNET: Efficient Pyramidal Network for Clothing Segmentation | Oct 13, 2020 | Data Augmentation, Image Description | —Unverified | 0 | 0 |
| Exemplar SVMs as Visual Feature Encoders | Jun 1, 2015 | Image Classification | —Unverified | 0 | 0 |
| Exploring the Behavior of Classic REG Algorithms in the Description of Characters in 3D Images | Sep 1, 2017 | Image Description, Referring Expression | —Unverified | 0 | 0 |
| Exploring the Use of Contrastive Language-Image Pre-Training for Human Posture Classification: Insights from Yoga Pose Analysis | Jan 13, 2025 | Image Description, Transfer Learning | —Unverified | 0 | 0 |
| Exploring Visual Relationship for Image Captioning | Sep 19, 2018 | Decoder, Image Captioning | —Unverified | 0 | 0 |
| Zero-Resource Neural Machine Translation with Multi-Agent Communication Game | Feb 9, 2018 | Decoder, Image Captioning | —Unverified | 0 | 0 |
| Face2Text revisited: Improved data set and baseline results | May 24, 2022 | Image Description, Transfer Learning | —Unverified | 0 | 0 |
| Facial Expression Recognition and Image Description Generation in Vietnamese | Aug 12, 2022 | Descriptive, Emotion Recognition | —Unverified | 0 | 0 |
| Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition | Apr 23, 2017 | Attribute, Image Captioning | —Unverified | 0 | 0 |
| Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description | Oct 19, 2017 | Image Description, Machine Translation | —Unverified | 0 | 0 |