| Microsoft COCO Captions: Data Collection and Evaluation Server | Apr 1, 2015 | Caption Generation | Code Available | 1 | 5 |
| Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts | Feb 17, 2021 | Caption Generation, Diversity | Code Available | 1 | 5 |
| Connecting What to Say With Where to Look by Modeling Human Attention Traces | May 12, 2021 | Caption Generation, Image Captioning | Code Available | 1 | 5 |
| MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response | Sep 15, 2023 | Caption Generation, Language Modelling | Code Available | 1 | 5 |
| COSMic: A Coherence-Aware Generation Metric for Image Descriptions | Sep 11, 2021 | Caption Generation, Image Captioning | Code Available | 1 | 5 |
| Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension | Oct 18, 2024 | Caption Generation | Code Available | 1 | 5 |
| Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network | Dec 13, 2020 | Caption Generation, Decoder | Code Available | 1 | 5 |
| Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs | Mar 1, 2020 | Attribute, Caption Generation | Code Available | 1 | 5 |
| Belief Revision based Caption Re-ranker with Visual Semantic Information | Sep 16, 2022 | Caption Generation, Image Captioning | Code Available | 1 | 5 |
| Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object Localization | Jun 11, 2021 | Caption Generation, Object | Code Available | 1 | 5 |