| UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling | Nov 23, 2021 | Image Captioning, Image Description | Code Available | 1 |
| DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue Dataset | Dec 8, 2022 | Diversity, Image Description | Code Available | 1 |
| CIDEr: Consensus-based Image Description Evaluation | Nov 20, 2014 | Action Recognition, Attribute | Code Available | 1 |
| Revisiting Binary Local Image Description for Resource Limited Devices | Aug 18, 2021 | Image Description, Triplet | Code Available | 1 |
| Curriculum Learning for Multi-Task Classification of Visual Attributes | Aug 29, 2017 | Attribute, Classification | —Unverified | 0 |
| Computer Vision and Conflicting Values: Describing People with Automated Alt Text | May 26, 2021 | Image Description | —Unverified | 0 |
| A Fine-Grained Image Description Generation Method Based on Joint Objectives | Sep 2, 2023 | Image Description, Object | —Unverified | 0 |
| A Genetic Algorithm Approach for Image Representation Learning through Color Quantization | Nov 18, 2017 | Content-Based Image Retrieval, Image Description | —Unverified | 0 |
| A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching | Jun 1, 2013 | Image Description, Video Description | —Unverified | 0 |
| Curriculum Learning of Visual Attribute Clusters for Multi-Task Classification | Sep 19, 2017 | Attribute, Classification | —Unverified | 0 |