| Generating Image Captions in Arabic using Root-Word Based Recurrent Neural Networks and Deep Neural Networks | Jun 1, 2018 | Caption Generation, Image Captioning | —Unverified | 0 | 0 |
| Generating captions without looking beyond objects | Oct 12, 2016 | Caption Generation, Image Captioning | —Unverified | 0 | 0 |
| GEM-VPC: A dual Graph-Enhanced Multimodal integration for Video Paragraph Captioning | Oct 12, 2024 | Caption Generation, Decoder | —Unverified | 0 | 0 |
| GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance | May 25, 2025 | Caption Generation, Question Answering | —Unverified | 0 | 0 |
| Deep Bayesian Natural Language Processing | Jul 1, 2019 | Caption Generation, Clustering | —Unverified | 0 | 0 |
| Bi-directional Contextual Attention for 3D Dense Captioning | Aug 13, 2024 | 3D dense captioning, Attribute | —Unverified | 0 | 0 |
| Fusion Models for Improved Visual Captioning | Oct 28, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism | Nov 25, 2023 | Caption Generation, Denoising | —Unverified | 0 | 0 |
| D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | Dec 2, 2021 | 3D dense captioning, 3D visual grounding | —Unverified | 0 | 0 |
| BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving | Jan 2, 2024 | Autonomous Driving, Caption Generation | —Unverified | 0 | 0 |