SOTAVerified

Video Summarization

Video summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or a set of video fragments (a.k.a. video key-fragments) that have been stitched together in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
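
The storyboard and skim outputs described above can be illustrated with a minimal sketch. It assumes that per-frame and per-fragment importance scores are already available from some model. The function names `storyboard` and `skim`, the `budget` parameter, and the greedy selection rule are hypothetical choices made for illustration and are not taken from the survey; many published methods select fragments by solving a knapsack problem instead.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames, in chronological order."""
    # Rank frame indices by importance score, highest first, and keep the top k.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # A storyboard presents key-frames in the order they occur in the video.
    return sorted(top)


def skim(
    fragments: List[Tuple[int, int]],
    frag_scores: List[float],
    budget: int,
) -> List[Tuple[int, int]]:
    """Greedily pick (start, end) fragments, end exclusive, within a frame budget.

    Assumption: greedy selection by score; this is a simplification.
    """
    order = sorted(range(len(fragments)), key=lambda i: frag_scores[i], reverse=True)
    chosen: List[Tuple[int, int]] = []
    used = 0
    for i in order:
        start, end = fragments[i]
        length = end - start
        # Keep the fragment only if it still fits in the remaining budget.
        if used + length <= budget:
            chosen.append(fragments[i])
            used += length
    # Stitch the selected fragments in chronological order.
    return sorted(chosen)


if __name__ == "__main__":
    print(storyboard([0.1, 0.9, 0.3, 0.8, 0.2, 0.7], k=3))
    print(skim([(0, 10), (10, 25), (25, 30), (30, 50)], [0.2, 0.9, 0.6, 0.8], budget=25))
```

With the toy inputs above, the storyboard keeps frames 1, 3 and 5, and the skim keeps the fragments (10, 25) and (25, 30), because the 20-frame fragment (30, 50) no longer fits the 25-frame budget once the top-scoring fragment has been taken.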

Papers

Showing 51-100 of 280 papers

Title | Status | Hype
A Paradigm for Building Generalized Models of Human Image Perception Through Data Fusion | | 0
Creating Summaries from User Videos | | 0
A Memory Network Approach for Story-based Temporal Summarization of 360° Videos | | 0
Co-Regularized Deep Representations for Video Summarization | | 0
Improving Sequential Determinantal Point Processes for Supervised Video Summarization | | 0
DeepQAMVS: Query-Aware Hierarchical Pointer Networks for Multi-Video Summarization | | 0
A Survey on Patch-based Synthesis: GPU Implementation and Optimization | | 0
Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance | | 0
Detecting Engagement in Egocentric Video | | 0
Discovery of Shared Semantic Spaces for Multi-Scene Video Query and Summarization | | 0
Attention is all you need for Videos: Self-attention based Video Summarization using Universal Transformers | | 0
A Novel Trustworthy Video Summarization Algorithm Through a Mixture of LoRA Experts | | 0
A Novel Approach for Robust Multi Human Action Recognition and Summarization based on 3D Convolutional Neural Networks | | 0
Comprehensive Video Understanding: Video summarization with content-based video recommender design | | 0
Gaze-Enabled Egocentric Video Summarization via Constrained Submodular Maximization | | 0
FullTransNet: Full Transformer with Local-Global Attention for Video Summarization | | 0
Compare and Select: Video Summarization with Multi-Agent Reinforcement Learning | | 0
Image Conditioned Keyframe-Based Video Summarization Using Object Detection | | 0
Key Frame Extraction with Attention Based Deep Neural Networks | | 0
Common Action Discovery and Localization in Unconstrained Videos | | 0
Submodular Maximization in Clean Linear Time | | 0
How Good is a Video Summary? A New Benchmarking Dataset and Evaluation Framework Towards Realistic Video Summarization | | 0
CNN-Based Prediction of Frame-Level Shot Importance for Video Summarization | | 0
A Graph-based Ranking Approach to Extract Key-frames for Static Video Summarization | | 0
Highlight Detection With Pairwise Deep Ranking for First-Person Video Summarization | | 0
How Local is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization | | 0
CFSum: A Transformer-Based Multi-Modal Video Summarization Framework With Coarse-Fine Fusion | | 0
A Dataset and Preliminary Results for Umpire Pose Detection Using SVM Classification of Deep Features | | 0
Hierarchical Recurrent Neural Network for Video Summarization | | 0
Causal Video Summarizer for Video Exploration | | 0
Enhancing Video Summarization via Vision-Language Embedding | | 0
Causalainer: Causal Explainer for Automatic Video Summarization | | 0
Enhancing Video Memorability Prediction with Text-Motion Cross-modal Contrastive Loss and Its Application in Video Summarization | | 0
A Novel Technique for Evidence based Conditional Inference in Deep Neural Networks via Latent Feature Perturbation | | 0
Exploring Efficient Foundational Multi-modal Models for Video Summarization | | 0
Exploring global diverse attention via pairwise temporal relation for video summarization | | 0
Exploring Global Diversity and Local Context for Video Summarization | | 0
Facilitating the Production of Well-tailored Video Summaries for Sharing on Social Media | | 0
Fast Graph Sampling for Short Video Summarization using Gershgorin Disc Alignment | | 0
FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts | | 0
ElasticPlay: Interactive Video Summarization with Dynamic Time Budgets | | 0
FaVChat: Unlocking Fine-Grained Facial Video Understanding with Multimodal Large Language Models | | 0
FrameRank: A Text Processing Approach to Video Summarization | | 0
From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection | | 0
Hierarchical Multimodal Transformer to Summarize Videos | | 0
HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization | | 0
Generating Natural Language Summaries for Multimedia | | 0
Global-and-Local Relative Position Embedding for Unsupervised Video Summarization | | 0
Conditional Modeling Based Automatic Video Summarization | | 0
EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PGL-SUM | F1-score (Canonical) | 55.6 | | Unverified
2 | RR-STG | F1-score (Canonical) | 54.5 | | Unverified
3 | DSNet | F1-score (Canonical) | 53 | | Unverified
4 | VASNet | F1-score (Canonical) | 49.71 | | Unverified
5 | M-AVS | F1-score (Canonical) | 44.4 | | Unverified
6 | CSTA | Kendall's Tau | 0.25 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RR-STG | F1-score (Canonical) | 63 | | Unverified
2 | DSNet | F1-score (Canonical) | 62.1 | | Unverified
3 | VASNet | F1-score (Canonical) | 61.42 | | Unverified
4 | M-AVS | F1-score (Canonical) | 61 | | Unverified
5 | PGL-SUM | F1-score (Canonical) | 61 | | Unverified
6 | CSTA | Kendall's Tau | 0.19 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Shotluck-Holmes (3.1B) | CIDEr | 152.3 | | Unverified
2 | Shotluck-Holmes (3.1B) | CIDEr | 63.2 | | Unverified
3 | SUM-shot | CIDEr | 8.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | EgoVLPv2 | F1 (avg) | 52.08 | | Unverified
2 | EgoVLP | F1 (avg) | 49.72 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PGL-SUM | MAP (50%) | 61.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTSUM-BLIP | 1 shot Micro-F1 | 23.5 | | Unverified
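
Several rows above report an F1-score. As a rough illustration of how such a score is commonly obtained in video summarization, the sketch below computes F1 from the frame-level overlap between a predicted summary and a reference summary. It assumes that both summaries are given as binary per-frame selection vectors of equal length; the function name `summary_f1` is hypothetical, and the exact protocol behind each leaderboard entry (for example, how multiple reference summaries are combined) is not specified on this page.

```python
from typing import Sequence


def summary_f1(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Frame-level F1 between two binary selection vectors (1 = frame is in the summary)."""
    if len(pred) != len(ref):
        raise ValueError("pred and ref must cover the same number of frames")
    # Frames selected by both the predicted and the reference summary.
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    if overlap == 0:
        return 0.0
    n_pred = sum(1 for p in pred if p)  # frames in the predicted summary
    n_ref = sum(1 for r in ref if r)    # frames in the reference summary
    precision = overlap / n_pred
    recall = overlap / n_ref
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Two of the three selected frames coincide, so precision = recall = 2/3.
    print(summary_f1([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```

The function returns a fraction between 0 and 1; multiplying by 100 gives a value on the percentage-style scale that the F1 figures in the tables appear to use.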