| SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models | Mar 4, 2025 | Image Description | Code Available | 1 | 5 |
| Chatting Makes Perfect: Chat-based Image Retrieval | May 31, 2023 | Chat-based Image Retrieval, Image Description | Code Available | 1 | 5 |
| CIDEr: Consensus-based Image Description Evaluation | Nov 20, 2014 | Action Recognition, Attribute | Code Available | 1 | 5 |
| Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models | May 19, 2015 | Image Description, Phrase Grounding | Code Available | 1 | 5 |
| ContextRef: Evaluating Referenceless Metrics For Image Description Generation | Sep 21, 2023 | Image Description | Code Available | 0 | 5 |
| Human Attention in Image Captioning: Dataset and Analysis | Mar 6, 2019 | Image Captioning, Image Description | Code Available | 0 | 5 |
| Compositional Obverter Communication Learning From Raw Visual Input | Apr 6, 2018 | Image Description | Code Available | 0 | 5 |
| Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze | Nov 9, 2020 | cross-modal alignment, Image Captioning | Code Available | 0 | 5 |
| How Do Image Description Systems Describe People? A Targeted Assessment of System Competence in the PEOPLE-domain | Dec 1, 2020 | Image Description | Code Available | 0 | 5 |
| Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions | Mar 10, 2018 | Image Description, Image to text | Code Available | 0 | 5 |