Temporal Tessellation: A Unified Approach for Video Analysis Dec 21, 2016 Action Detection Video Captioning
Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning Mar 31, 2018 Decoder Dense Video Captioning
Top-down Visual Saliency Guided by Captions Dec 21, 2016 Decoder Sentence
Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning May 3, 2019 Decoder Sentence
Towards Automatic Learning of Procedures from Web Instructional Videos Mar 28, 2017 Dense Video Captioning Procedure Learning
Visual Transformation Telling May 3, 2023 Dense Video Captioning Video Captioning
Streaming Dense Video Captioning Apr 1, 2024 Dense Video Captioning Live Video Captioning
https://arxiv.org/abs/2407.00634 Jul 2, 2024 Video Captioning Video Description
Streamlined Dense Video Captioning Apr 8, 2019 Dense Video Captioning Reinforcement Learning
SoccerNet 2024 Challenges Results Sep 16, 2024 Action Spotting Dense Video Captioning
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data Nov 17, 2015 Image Captioning Novel Concepts
BERTHA: Video Captioning Evaluation Via Transfer-Learned Human Assessment Jan 25, 2022 Language Modeling Language Modelling
Support-set based Multi-modal Representation Enhancement for Video Captioning May 19, 2022 Video Captioning
Sketch, Ground, and Refine: Top-Down Dense Video Captioning Jun 19, 2021 Dense Video Captioning Sentence
Event and Entity Extraction from Generated Video Captions Nov 5, 2022 Caption Generation Dense Video Captioning
Screencast Tutorial Video Understanding Jun 1, 2020 object-detection Object Detection
Video captioning with stacked attention and semantic hard pull Sep 15, 2020 Decoder Video Captioning
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention Sep 7, 2021 Sensor Fusion Video Captioning
Cross-Modal and Hierarchical Modeling of Video and Text Oct 16, 2018 Action Recognition Retrieval
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning Dec 30, 2024 Contrastive Learning Question Answering
Reconstruction Network for Video Captioning Mar 30, 2018 Decoder Sentence
Pseudo-labeling with Keyword Refining for Few-Supervised Video Captioning Nov 6, 2024 Video Captioning
Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning Nov 28, 2022 FAD Video Captioning
Pretrained Image-Text Models are Secretly Video Captioners Feb 19, 2025 Image Captioning Video Captioning
Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks Oct 7, 2023 Action Recognition Multiple-choice
OSVidCap: A Framework for the Simultaneous Recognition and Description of Concurrent Actions in Videos in an Open-Set Scenario Sep 29, 2021 Decoder Open Set Video Captioning
Cross-Modal Graph with Meta Concepts for Video Captioning Aug 14, 2021 object-detection Object Detection
ActBERT: Learning Global-Local Video-Text Representations Nov 14, 2020 Action Segmentation Question Answering
Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network Aug 27, 2019 Caption Generation Decoder
Oracle performance for visual captioning Nov 14, 2015 Image Captioning Language Modeling
NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning Jul 9, 2018 General Classification Machine Translation
M-VAD Names: a Dataset for Video Captioning with Naming Mar 4, 2019 TAG Video Captioning
Non-Autoregressive Coarse-to-Fine Video Captioning Nov 27, 2019 Sentence Video Captioning
Continual and Multi-Task Architecture Search Jun 12, 2019 Continual Learning General Classification
Deep Learning for Video Classification and Captioning Sep 22, 2016 Classification Deep Learning
Contextual Explainable Video Representation: Human Perception-based Understanding Dec 12, 2022 Action Detection Action Recognition
FocusedAD: Character-centric Movie Audio Description Apr 16, 2025 Video Captioning
FLASH: Latent-Aware Semi-Autoregressive Speculative Decoding for Multimodal Tasks May 19, 2025 Video Captioning
Delving Deeper into Convolutional Networks for Learning Video Representations Nov 19, 2015 Action Recognition Decoder
StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation Sep 13, 2022 Image Generation Story Continuation
ContCap: A scalable framework for continual image captioning Sep 19, 2019 Continual Learning Image Captioning
MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description Sep 19, 2018 Decoder Video Captioning
A Survey of Video Datasets for Grounded Event Understanding Jun 14, 2024 Common Sense Reasoning Event Extraction
Multi-attention Networks for Temporal Localization of Video-level Labels Nov 15, 2019 Action Recognition Temporal Action Localization
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian Jun 20, 2023 Cross-Lingual Transfer Retrieval
OmniNet: A unified architecture for multi-modal multi-task learning Jul 17, 2019 Image Captioning Multi-Task Learning
FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework Apr 9, 2021 Language Modelling Multiple-choice
A Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Aug 31, 2019 Sentence Video Captioning
Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System? Apr 11, 2019 Machine Translation Translation
Meaning guided video captioning Dec 12, 2019 Decoder object-detection