| Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures | Jan 15, 2016 | Image Description, Retrieval | Unverified | 0 | 0 |
| A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching | Jun 1, 2013 | Image Description, Video Description | Unverified | 0 | 0 |
| A Shared Task on Multimodal Machine Translation and Crosslingual Image Description | Aug 1, 2016 | Image Description, Image Retrieval | Unverified | 0 | 0 |
| Data-augmented phrase-level alignment for mitigating object hallucination | May 28, 2024 | Data Augmentation, Hallucination | Unverified | 0 | 0 |
| Adding the Third Dimension to Spatial Relation Detection in 2D Images | Nov 1, 2018 | Image Description, Object | Unverified | 0 | 0 |
| Multilingual Image Corpus – Towards a Multimodal and Multilingual Dataset | Jun 1, 2022 | Caption Generation, image-classification | Unverified | 0 | 0 |
| TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models | Nov 2, 2024 | Image Description, Image Generation | Unverified | 0 | 0 |
| Multimodal fusion via cortical network inspired losses | May 1, 2022 | Emotion Recognition, Image Description | Unverified | 0 | 0 |
| Multi-modal gated recurrent units for image description | Apr 20, 2019 | Image Description, Sentence | Unverified | 0 | 0 |
| Multimodal Machine Translation with Reinforcement Learning | May 7, 2018 | Image Description, Machine Translation | Unverified | 0 | 0 |