| Video Captioning in Compressed Video | Jan 2, 2021 | Caption Generation, Video Captioning | —Unverified | 0 | 0 |
| Video Captioning with Guidance of Multimodal Latent Topics | Aug 31, 2017 | Caption Generation, Decoder | —Unverified | 0 | 0 |
| Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives | May 20, 2025 | Caption Generation, Contrastive Learning | —Unverified | 0 | 0 |
| Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning | Nov 2, 2023 | Caption Generation, Efficient Exploration | —Unverified | 0 | 0 |
| Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation | Apr 30, 2024 | Caption Generation, Hallucination | —Unverified | 0 | 0 |
| WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset | Nov 1, 2019 | Caption Generation, Translation | —Unverified | 0 | 0 |
| Denoising Large-Scale Image Captioning from Alt-text Data using Content Selection Models | Sep 10, 2020 | Caption Generation, Denoising | —Unverified | 0 | 0 |
| Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching | May 18, 2021 | Caption Generation, Cross-Modal Retrieval | —Unverified | 0 | 0 |
| What is not where: the challenge of integrating spatial representations into deep learning architectures | Jul 21, 2018 | Caption Generation, Deep Learning | —Unverified | 0 | 0 |
| Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned | Sep 26, 2022 | Caption Generation, Semantic Similarity | —Unverified | 0 | 0 |