Interactive-predictive neural multimodal systems May 30, 2019 Machine Translation Translation — Unverified 0
Interpretable Video Captioning via Trajectory Structured Localization Jun 1, 2018 Decoder Image Captioning — Unverified 0
iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering Nov 16, 2020 Common Sense Reasoning Dense Video Captioning — Unverified 0
iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability Jun 25, 2021 Bias Detection Question Answering — Unverified 0
It's Just Another Day: Unique Video Captioning by Discriminative Prompting Oct 15, 2024 Video Captioning — Unverified 0
Jointly Localizing and Describing Events for Dense Video Captioning Apr 23, 2018 Attribute Dense Video Captioning — Unverified 0
Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning Oct 1, 2019 POS POS Tagging — Unverified 0
Knowledge Distillation for Efficient Audio-Visual Video Captioning Jun 16, 2023 Audio-Visual Video Captioning Caption Generation — Unverified 0
Knowledge Guided Entity-aware Video Captioning and A Basketball Benchmark Jan 25, 2024 Decoder Video Captioning — Unverified 0
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision Apr 15, 2023 Language Modeling Language Modelling — Unverified 0
Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding May 16, 2020 Abstractive Text Summarization Decoder — Unverified 0
Learning Actions from Human Demonstration Video for Robotic Manipulation Sep 10, 2019 Video Captioning — Unverified 0
Recurrent Memory Addressing for describing videos Nov 20, 2016 Video Captioning — Unverified 0
Reexamining Racial Disparities in Automatic Speech Recognition Performance: The Role of Confounding by Provenance Jul 19, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR) — Unverified 0
ReGen: A good Generative Zero-Shot Video Classifier Should be Rewarded Jan 1, 2023 Action Classification Action Recognition — Unverified 0
Reinforced Video Captioning with Entailment Rewards Aug 7, 2017 reinforcement-learning Reinforcement Learning — Unverified 0
Relational Reasoning using Prior Knowledge for Visual Captioning Jun 4, 2019 Image Captioning object-detection — Unverified 0
Retrieval-Augmented Egocentric Video Captioning Jan 1, 2024 Representation Learning Retrieval — Unverified 0
RETTA: Retrieval-Enhanced Test-Time Adaptation for Zero-Shot Video Captioning May 11, 2024 Image-text matching Retrieval — Unverified 0
RUC+CMU: System Report for Dense Captioning Events in Videos Jun 22, 2018 Caption Generation Dense Captioning — Unverified 0
SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning Jun 25, 2020 Dense Video Captioning Video Captioning — Unverified 0
SAVCHOI: Detecting Suspicious Activities using Dense Video Captioning with Human Object Interactions Jul 24, 2022 Dense Captioning Dense Video Captioning — Unverified 0
SBAT: Video Captioning with Sparse Boundary-Aware Transformer Jul 23, 2020 Machine Translation multimodal interaction — Unverified 0
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data Apr 4, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR) — Unverified 0
Semantic-Aware Pretraining for Dense Video Captioning Apr 13, 2022 Dense Captioning Dense Video Captioning — Unverified 0
Semi-Supervised Learning for Video Captioning Nov 1, 2020 Video Captioning — Unverified 0
SEM-POS: Grammatically and Semantically Correct Video Captioning Mar 26, 2023 POS Video Captioning — Unverified 0
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding Nov 25, 2024 Dense Video Captioning Transfer Learning — Unverified 0
Set Prediction Guided by Semantic Concepts for Diverse Video Captioning Dec 25, 2023 Caption Generation Diversity — Unverified 0
Show, Tell and Summarize: Dense Video Captioning Using Visual Cue Aided Sentence Summarization Jun 25, 2025 Dense Video Captioning Descriptive — Unverified 0
SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability Oct 7, 2019 Text Generation Video Captioning — Unverified 0
SnapCap: Efficient Snapshot Compressive Video Captioning Jan 10, 2024 Compressive Sensing Video Captioning — Unverified 0
Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation Mar 8, 2024 Articles Hallucination — Unverified 0
Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation Jul 12, 2020 Decoder Graph-to-Sequence — Unverified 0
Spatio-Temporal Attention Models for Grounded Video Captioning Oct 17, 2016 image-classification Image Classification — Unverified 0
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning Feb 27, 2019 Attribute Caption Generation — Unverified 0
Spatio-Temporal Ranked-Attention Networks for Video Captioning Jan 17, 2020 Video Captioning — Unverified 0
SPECTRUM: Semantic Processing and Emotion-informed video-Captioning Through Retrieval and Understanding Modalities Nov 4, 2024 Attribute Descriptive — Unverified 0
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training Feb 20, 2023 Language Modelling Object — Unverified 0
Story Generation from Visual Inputs: Techniques, Related Tasks, and Challenges Jun 4, 2024 Question Answering Story Generation — Unverified 0
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network Jun 2, 2016 Video Captioning Visual Storytelling — Unverified 0
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos Jun 27, 2023 Multi-Task Learning Scene Understanding — Unverified 0
SOVC: Subject-Oriented Video Captioning Dec 20, 2023 Video Captioning — Unverified 0
Supervising Neural Attention Models for Video Captioning by Human Gaze Data Jul 19, 2017 Descriptive Gaze Prediction — Unverified 0
Task-Driven Dynamic Fusion: Reducing Ambiguity in Video Description Jul 1, 2017 Video Captioning Video Description — Unverified 0
TCR: Short Video Title Generation and Cover Selection with Attention Refinement Apr 25, 2023 Video Captioning — Unverified 0
Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning Jun 14, 2020 Dense Captioning Dense Video Captioning — Unverified 0
Technical Report for Soccernet 2023 -- Dense Video Captioning Oct 31, 2024 Dense Video Captioning Video Captioning — Unverified 0
Temporally Grounding Natural Sentence in Video Oct 1, 2018 Sentence Video Captioning — Unverified 0
Temporal Object Captioning for Street Scene Videos from LiDAR Tracks May 22, 2025 Caption Generation Video Captioning — Unverified 0