SODA: Story Oriented Dense Video Captioning Evaluation Framework Aug 1, 2020 Dense Video Captioning Video Captioning
SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning Nov 25, 2021 Caption Generation Question Answering
Improving Generation and Evaluation of Visual Stories via Semantic Consistency May 20, 2021 Image Generation Story Visualization
Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data Jan 16, 2024 Image Generation Text to Image Generation
Comprehensive Information Integration Modeling Framework for Video Titling Jun 24, 2020 Descriptive Video Captioning
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks Nov 23, 2020 Action Classification Action Localization
AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding Jun 19, 2024 Question Answering Spatial Reasoning
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners May 22, 2022 Attribute Automatic Speech Recognition
LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning Jun 17, 2023 Boundary Captioning Language Modeling
Hierarchical Modular Network for Video Captioning Nov 24, 2021 Representation Learning Sentence
A Comprehensive Review of the Video-to-Text Problem Mar 27, 2021 Question Answering Retrieval
Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation Aug 17, 2016 Caption Generation Decoder
Hierarchical Video-Moment Retrieval and Step-Captioning Mar 29, 2023 Information Retrieval Moment Retrieval
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning Jan 12, 2025 Dense Video Captioning Video Captioning
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training May 1, 2020 Language Modeling Language Modelling
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching Nov 17, 2021 Language Modelling Video Captioning
Learning to Generate Grounded Visual Captions without Localization Supervision Jun 1, 2019 Image Captioning Language Modelling
HiCM^2: Hierarchical Compact Memory Modeling for Dense Video Captioning Dec 19, 2024 Dense Video Captioning Video Captioning
Large Scale Holistic Video Understanding Apr 25, 2019 Action Classification Action Recognition
GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation Mar 26, 2023 Video Captioning
GL-RG: Global-Local Representation Granularity for Video Captioning May 22, 2022 Caption Generation Descriptive
A Reinforcement Learning Based Encoder-Decoder Framework for Learning Stock Trading Rules Jan 8, 2021 Decoder Deep Reinforcement Learning
Improved Actor Relation Graph based Group Activity Recognition Oct 24, 2020 Activity Recognition Group Activity Recognition
Accurate and Fast Compressed Video Captioning Sep 22, 2023 Video Captioning
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o Dec 18, 2024 Image Captioning Video Captioning