| Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback | Mar 11, 2024 | Caption Generation, reinforcement-learning | —Unverified | 0 |
| FaceGemma: Enhancing Image Captioning with Facial Attributes for Portrait Images | Sep 24, 2023 | Attribute, Caption Generation | —Unverified | 0 |
| Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech | May 31, 2018 | Caption Generation, Diversity | —Unverified | 0 |
| Fast Image Caption Generation with Position Alignment | Dec 13, 2019 | Caption Generation, Decoder | —Unverified | 0 |
| A Comparative Study of Pre-trained CNNs and GRU-Based Attention for Image Caption Generation | Oct 11, 2023 | Caption Generation, Decoder | —Unverified | 0 |
| Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation | May 22, 2024 | Caption Generation, Hallucination | —Unverified | 0 |
| Hierarchical LSTMs with Adaptive Attention for Visual Captioning | Dec 26, 2018 | Caption Generation, Image Captioning | —Unverified | 0 |
| FE-LWS: Refined Image-Text Representations via Decoder Stacking and Fused Encodings for Remote Sensing Image Captioning | Feb 13, 2025 | Caption Generation, Decoder | —Unverified | 0 |
| Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality Learning | Feb 19, 2025 | Caption Generation, Classification | —Unverified | 0 |
| Fine-Grained Video Captioning through Scene Graph Consolidation | Feb 23, 2025 | Caption Generation, Image Captioning | —Unverified | 0 |
| End to End Recognition System for Recognizing Offline Unconstrained Vietnamese Handwriting | May 14, 2019 | Caption Generation, Decoder | —Unverified | 0 |
| Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning | Jun 5, 2017 | Caption Generation, Decoder | —Unverified | 0 |
| Identifying Multi-modal Knowledge Neurons in Pretrained Transformers via Two-stage Filtering | Mar 29, 2025 | Caption Generation, knowledge editing | —Unverified | 0 |
| Fusion Models for Improved Visual Captioning | Oct 28, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance | May 25, 2025 | Caption Generation, Question Answering | —Unverified | 0 |
| GEM-VPC: A dual Graph-Enhanced Multimodal integration for Video Paragraph Captioning | Oct 12, 2024 | Caption Generation, Decoder | —Unverified | 0 |
| Generating captions without looking beyond objects | Oct 12, 2016 | Caption Generation, Image Captioning | —Unverified | 0 |
| Generating Image Captions in Arabic using Root-Word Based Recurrent Neural Networks and Deep Neural Networks | Jun 1, 2018 | Caption Generation, Image Captioning | —Unverified | 0 |
| Generating image captions with external encyclopedic knowledge | Oct 10, 2022 | Caption Generation, Image Captioning | —Unverified | 0 |
| Generating Video Description using Sequence-to-sequence Model with Temporal Attention | Dec 1, 2016 | Caption Generation, Sentence | —Unverified | 0 |
| Empirical Analysis of Image Caption Generation using Deep Learning | May 14, 2021 | Caption Generation, Decoder | —Unverified | 0 |
| Geometry-Entangled Visual Semantic Transformer for Image Captioning | Sep 29, 2021 | Caption Generation, Image Captioning | —Unverified | 0 |
| Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models | Nov 18, 2019 | Anomaly Detection, Autonomous Driving | —Unverified | 0 |
| E-MMAD: Multimodal Advertising Caption Generation Based on Structured Information | Nov 16, 2021 | Caption Generation, valid | —Unverified | 0 |
| Aligning Images and Text with Semantic Role Labels for Fine-Grained Cross-Modal Understanding | Jun 1, 2022 | Caption Generation, Image Retrieval | —Unverified | 0 |