Syntax Customized Video Captioning by Imitating Exemplar Sentences Dec 2, 2021 Decoder Diversity
Code Available 05 Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning May 3, 2019 Decoder Sentence
Code Available 05 Temporal Tessellation: A Unified Approach for Video Analysis Dec 21, 2016 Action Detection Video Captioning
Code Available 05 Top-down Visual Saliency Guided by Captions Dec 21, 2016 Decoder Sentence
Code Available 05 Towards Automatic Learning of Procedures from Web Instructional Videos Mar 28, 2017 Dense Video Captioning Procedure Learning
Code Available 05 Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos Apr 28, 2022 Action Understanding Video Captioning
Code Available 05 Translating Videos to Natural Language Using Deep Recurrent Neural Networks Dec 15, 2014 Sentence Text Generation
Code Available 05 VideoBERT: A Joint Model for Video and Language Representation Learning Apr 3, 2019 Action Classification General Classification
Code Available 05 Video Description using Bidirectional Recurrent Neural Networks Apr 12, 2016 Decoder Text Generation
Code Available 05 Video Summarization: Towards Entity-Aware Captions Dec 1, 2023 Image Captioning Video Captioning
Code Available 05 Visual Transformation Telling May 3, 2023 Dense Video Captioning Video Captioning
Code Available 05 VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding May 20, 2021 Action Segmentation Language Modeling
Code Available 05 Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning Apr 15, 2018 Video Captioning Video Understanding
Code Available 05 Knowledge Guided Entity-aware Video Captioning and A Basketball Benchmark Jan 25, 2024 Decoder Video Captioning
— Unverified 00 Fine-Grained Video Captioning for Sports Narrative Jun 1, 2018 2k Video Captioning
— Unverified 00 LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision Apr 15, 2023 Language Modeling
— Unverified 00 Fine-grained length controllable video captioning with ordinal embeddings Aug 27, 2024 Video Captioning
— Unverified 00 Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding May 16, 2020 Abstractive Text Summarization Decoder
— Unverified 00 Learning Actions from Human Demonstration Video for Robotic Manipulation Sep 10, 2019 Video Captioning
— Unverified 00 Learning Audio-Video Modalities from Image Captions Apr 1, 2022 Image Captioning Retrieval
— Unverified 00 FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity Nov 23, 2024 Attribute Cross-Modal Retrieval
— Unverified 00 Learning Interactive Real-World Simulators Oct 9, 2023 Video Captioning
— Unverified 00 Fill-in-the-Blank: A Challenging Video Understanding Evaluation Framework Nov 16, 2021 Multiple-choice Question Answering
— Unverified 00 Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning Nov 7, 2018 Mixture-of-Experts Video Captioning
— Unverified 00 Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks Oct 10, 2022 Retrieval Text to Video Retrieval
— Unverified 00 Exploring the Role of Audio in Video Captioning Jun 21, 2023 Automatic Speech Recognition (ASR)
— Unverified 00 Less Is More: Picking Informative Frames for Video Captioning Mar 5, 2018 Decoder Diversity
— Unverified 00 Video Captioning with Text-based Dynamic Attention and Step-by-Step Learning Nov 5, 2019 Sentence Video Captioning
— Unverified 00 Exploring Temporal Event Cues for Dense Video Captioning in Cyclic Co-learning Dec 16, 2024 Contrastive Learning Dense Video Captioning
— Unverified 00 LongCaptioning: Unlocking the Power of Long Caption Generation in Large Multimodal Models Feb 21, 2025 Caption Generation Video Captioning
— Unverified 00 Video Captioning with Transferred Semantic Attributes Nov 23, 2016 Sentence Video Captioning
— Unverified 00 Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning Nov 1, 2019 Decoder Tensor Decomposition
— Unverified 00 Exploring Group Video Captioning with Efficient Relational Approximation Jan 1, 2023 Video Captioning
— Unverified 00 Exploration of Visual Features and their weighted-additive fusion for Video Captioning Jan 14, 2021 Video Captioning
— Unverified 00 M3: Multimodal Memory Modelling for Video Captioning Jun 1, 2018 Sentence Video Captioning
— Unverified 00 Exploiting long-term temporal dynamics for video captioning Feb 22, 2022 Video Captioning
— Unverified 00 Attention is all you need for Videos: Self-attention based Video Summarization using Universal Transformers Jun 6, 2019 All Dense Video Captioning
— Unverified 00 MAMS: Model-Agnostic Module Selection Framework for Video Captioning Jan 30, 2025 Caption Generation Video Captioning
— Unverified 00 Exo2EgoDVC: Dense Video Captioning of Egocentric Procedural Activities Using Web Instructional Videos Nov 28, 2023 Dense Video Captioning Transfer Learning
— Unverified 00 MAViC: Multimodal Active Learning for Video Captioning Dec 11, 2022 Active Learning Decoder
— Unverified 00 MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning Feb 27, 2024 Class Incremental Learning
— Unverified 00 AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction Jan 1, 2025 GPU Question Answering
— Unverified 00 EVLM: An Efficient Vision-Language Model for Visual Understanding Jul 19, 2024 Image Captioning Language Modeling
— Unverified 00 Attention based video captioning framework for Hindi Jun 17, 2021 Video Captioning
— Unverified 00 Attention Based Encoder Decoder Model for Video Captioning in Nepali (2023) Dec 12, 2023 Decoder Video Captioning
— Unverified 00 Attend and Interact: Higher-Order Object Interactions for Video Understanding Nov 16, 2017 Action Classification Action Recognition
— Unverified 00 Middle-Out Decoding Oct 28, 2018 Decoder Diversity
— Unverified 00 MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models Oct 13, 2024 Cross-Modal Retrieval Question Answering
— Unverified 00 Modality Alignment between Deep Representations for Effective Video-and-Language Learning Jun 1, 2022 Question Answering Video Captioning
— Unverified 00 Models See Hallucinations: Evaluating the Factuality in Video Captioning Mar 6, 2023 Text Generation Video Captioning
— Unverified 00