Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding | Sep 22, 2024 | Anomaly Detection, GPU | Code Available | 4
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding | May 22, 2024 | Dense Video Captioning, Highlight Detection | Code Available | 2
VideoSAGE: Video Summarization with Graph Representation Learning | Apr 14, 2024 | Graph Representation Learning, Node Classification | Code Available | 2
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video | Jan 10, 2024 | Video Summarization | Code Available | 2
UniVTG: Towards Unified Video-Language Temporal Grounding | Jul 31, 2023 | Highlight Detection, Moment Retrieval | Code Available | 2
Egocentric Video-Language Pretraining | Jun 3, 2022 | Action Recognition, Contrastive Learning | Code Available | 2
Do Language Models Understand Time? | Dec 18, 2024 | Action Recognition, Anomaly Detection | Code Available | 1
Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark | Dec 12, 2024 | Highlight Detection, Video Summarization | Code Available | 1
Shotluck Holmes: A Family of Efficient Small-Scale Large Language Vision Models For Video Captioning and Summarization | May 31, 2024 | Sentence, Video Captioning | Code Available | 1
Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos | Dec 16, 2023 | Video Captioning, video narration captioning | Code Available | 1
Adopting Self-Supervised Learning into Unsupervised Video Summarization through Restorative Score | Sep 11, 2023 | Self-Supervised Learning, Unsupervised Video Summarization | Code Available | 1
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone | Jul 11, 2023 | Action Recognition, Moment Queries | Code Available | 1
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos | Jun 7, 2023 | Text Summarization, Video Summarization | Code Available | 1
Joint Moment Retrieval and Highlight Detection Via Natural Language Queries | May 8, 2023 | Decoder, Highlight Detection | Code Available | 1
Hierarchical Video-Moment Retrieval and Step-Captioning | Mar 29, 2023 | Information Retrieval, Moment Retrieval | Code Available | 1
VideoXum: Cross-modal Visual and Textural Summarization of Videos | Mar 21, 2023 | Text Summarization, Video Summarization | Code Available | 1
Align and Attend: Multimodal Summarization with Dual Contrastive Losses | Mar 13, 2023 | Extractive Text Summarization, Supervised Video Summarization | Code Available | 1
VideoSum: A Python Library for Surgical Video Summarization | Feb 15, 2023 | Video Summarization | Code Available | 1
Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization | Nov 18, 2022 | Diversity, image-classification | Code Available | 1
Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of the Video Frames | Jun 29, 2022 | Benchmarking, Diversity | Code Available | 1
MHSCNet: A Multimodal Hierarchical Shot-aware Convolutional Network for Video Summarization | Apr 18, 2022 | Video Summarization | Code Available | 1
LTC-SUM: Lightweight Client-driven Personalized Video Summarization Framework Using 2D CNN | Jan 22, 2022 | Video Summarization | Code Available | 1
Progressive Video Summarization via Multimodal Self-supervised Learning | Jan 7, 2022 | Self-Supervised Learning, Supervised Video Summarization | Code Available | 1
Video Joint Modelling Based on Hierarchical Transformer for Co-summarization | Dec 27, 2021 | Retrieval, Supervised Video Summarization | Code Available | 1
Combining Global and Local Attention with Positional Encoding for Video Summarization | Dec 1, 2021 | Supervised Video Summarization, Video Summarization | Code Available | 1
IntentVizor: Towards Generic Query Guided Interactive Video Summarization | Sep 30, 2021 | Video Summarization, Video Understanding | Code Available | 1
Discriminative Latent Semantic Graph for Video Captioning | Aug 8, 2021 | Decoder, Object | Code Available | 1
Self-Attention Recurrent Summarization Network with Reinforcement Learning for Video Summarization Task | Jun 9, 2021 | reinforcement-learning, Reinforcement Learning | Code Available | 1
Multimodal Summarization of User-Generated Videos | Jun 5, 2021 | Video Summarization | Code Available | 1
Unsupervised Video Summarization via Multi-source Features | May 26, 2021 | Unsupervised Video Summarization, Video Summarization | Code Available | 1
TRECVID 2020: A comprehensive campaign for evaluating video retrieval tasks across multiple application domains | Apr 27, 2021 | Ad-hoc video search, Instance Search | Code Available | 1
Supervised Video Summarization via Multiple Feature Sets with Parallel Attention | Apr 23, 2021 | Automated Feature Engineering, image-classification | Code Available | 1
A Comprehensive Review of the Video-to-Text Problem | Mar 27, 2021 | Question Answering, Retrieval | Code Available | 1
Learning Discriminative Prototypes with Dynamic Time Warping | Mar 17, 2021 | Action Segmentation, Dynamic Time Warping | Code Available | 1
Movie Summarization via Sparse Graph Construction | Dec 14, 2020 | graph construction, Turning Point Identification | Code Available | 1
DSNet: A Flexible Detect-to-Summarize Network for Video Summarization | Dec 1, 2020 | regression, Supervised Video Summarization | Code Available | 1
AC-SUM-GAN: Connecting Actor-Critic and Generative Adversarial Networks for Unsupervised Video Summarization | Nov 16, 2020 | Generative Adversarial Network, Unsupervised Video Summarization | Code Available | 1
Multi-modal Summarization for Video-containing Documents | Sep 17, 2020 | Question Answering, Video Summarization | Code Available | 1
Ultrasound Video Summarization using Deep Reinforcement Learning | May 19, 2020 | Deep Reinforcement Learning, Diagnostic | Code Available | 1
Query-controllable Video Summarization | Apr 7, 2020 | Video Summarization | Code Available | 1
Convolutional Hierarchical Attention Network for Query-Focused Video Summarization | Jan 31, 2020 | Query focused video summarization, Video Summarization | Code Available | 1
TRIM: A Self-Supervised Video Summarization Framework Maximizing Temporal Relative Information and Representativeness | Jun 25, 2025 | Self-Supervised Learning, Supervised Video Summarization | Unverified | 0
MF2Summ: Multimodal Fusion for Video Summarization with Temporal Alignment | Jun 12, 2025 | Video Summarization | Unverified | 0
Prompts to Summaries: Zero-Shot Language-Guided Video Summarization | Jun 12, 2025 | GPU, Query focused video summarization | Unverified | 0
Enhancing Video Memorability Prediction with Text-Motion Cross-modal Contrastive Loss and Its Application in Video Summarization | Jun 10, 2025 | Prediction, Video Summarization | Unverified | 0
TriPSS: A Tri-Modal Keyframe Extraction Framework Using Perceptual, Structural, and Semantic Representations | Jun 3, 2025 | Retrieval, Video Summarization | Unverified | 0
Unsupervised Transcript-assisted Video Summarization and Highlight Detection | May 29, 2025 | Highlight Detection, Reinforcement Learning (RL) | Unverified | 0
REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing | May 24, 2025 | Language Modeling, Language Modelling | Unverified | 0