| Mind's Eye: A Recurrent Visual Representation for Image Caption Generation | Jun 1, 2015 | Caption Generation, Image Description | Unverified | 0 |
| Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models | May 19, 2015 | Image Description, Phrase Grounding | Code Available | 1 |
| Weakly Supervised Learning of Objects, Attributes and their Associations | Mar 31, 2015 | Attribute, Image Description | Unverified | 0 |
| Describing Videos by Exploiting Temporal Structure | Feb 27, 2015 | Action Recognition, Image Description | Code Available | 0 |
| Simple Image Description Generator via a Linear Phrase-Based Approach | Dec 29, 2014 | Descriptive, Image Description | Unverified | 0 |
| The Treasure beneath Convolutional Layers: Cross-convolutional-layer Pooling for Image Classification | Nov 27, 2014 | General Classification, image-classification | Code Available | 0 |
| CIDEr: Consensus-based Image Description Evaluation | Nov 20, 2014 | Action Recognition, Attribute | Code Available | 1 |
| Long-term Recurrent Convolutional Networks for Visual Recognition and Description | Nov 17, 2014 | Image Description, Retrieval | Code Available | 0 |
| Collecting Image Description Datasets using Crowdsourcing | Nov 12, 2014 | Image Description, Sentence | Unverified | 0 |
| Comparing Automatic Evaluation Measures for Image Description | Jun 1, 2014 | Image Description, Slot Filling | Unverified | 0 |