SOTAVerified

Video Summarization

Video Summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The produced summary is usually composed of a set of representative video frames (a.k.a. video key-frames), or of video fragments (a.k.a. video key-fragments) that have been stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
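The difference between a storyboard and a skim can be illustrated with a minimal sketch. It assumes that per-frame importance scores and fragment boundaries are already available. The function names `storyboard` and `skim`, the frame `budget` parameter, and the greedy selection rule are hypothetical choices made for illustration, not the method of any paper listed on this page.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Pick the k highest-scoring frames and return their indices in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(
    scores: List[float],
    fragments: List[Tuple[int, int]],
    budget: int,
) -> List[Tuple[int, int]]:
    """Greedily pick fragments (start inclusive, end exclusive) by mean frame score
    until the frame budget is used up, then return them in chronological order.

    Greedy selection is an assumption of this sketch, chosen for brevity.
    """
    ranked = sorted(
        fragments,
        key=lambda f: sum(scores[f[0]:f[1]]) / (f[1] - f[0]),
        reverse=True,
    )
    chosen: List[Tuple[int, int]] = []
    used = 0
    for start, end in ranked:
        length = end - start
        if used + length <= budget:
            chosen.append((start, end))
            used += length
    # Stitching in chronological order corresponds to sorting by start index.
    return sorted(chosen)


if __name__ == "__main__":
    frame_scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7]
    print(storyboard(frame_scores, k=3))
    print(skim(frame_scores, [(0, 2), (2, 4), (4, 6)], budget=4))
```

In this toy setup the storyboard is a list of frame indices, while the skim is a list of fragment ranges that would be concatenated to form the shorter video.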

Papers

Showing 76-100 of 280 papers

| Title | Status | Hype |
| --- | --- | --- |
| How Local is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization | | 0 |
| CFSum: A Transformer-Based Multi-Modal Video Summarization Framework With Coarse-Fine Fusion | | 0 |
| A Dataset and Preliminary Results for Umpire Pose Detection Using SVM Classification of Deep Features | | 0 |
| Hierarchical Recurrent Neural Network for Video Summarization | | 0 |
| Causal Video Summarizer for Video Exploration | | 0 |
| Enhancing Video Summarization via Vision-Language Embedding | | 0 |
| Causalainer: Causal Explainer for Automatic Video Summarization | | 0 |
| Enhancing Video Memorability Prediction with Text-Motion Cross-modal Contrastive Loss and Its Application in Video Summarization | | 0 |
| A Novel Technique for Evidence based Conditional Inference in Deep Neural Networks via Latent Feature Perturbation | | 0 |
| Exploring Efficient Foundational Multi-modal Models for Video Summarization | | 0 |
| Exploring global diverse attention via pairwise temporal relation for video summarization | | 0 |
| Exploring Global Diversity and Local Context for Video Summarization | | 0 |
| Facilitating the Production of Well-tailored Video Summaries for Sharing on Social Media | | 0 |
| Fast Graph Sampling for Short Video Summarization using Gershgorin Disc Alignment | | 0 |
| FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts | | 0 |
| ElasticPlay: Interactive Video Summarization with Dynamic Time Budgets | | 0 |
| FaVChat: Unlocking Fine-Grained Facial Video Understanding with Multimodal Large Language Models | | 0 |
| FrameRank: A Text Processing Approach to Video Summarization | | 0 |
| From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection | | 0 |
| Hierarchical Multimodal Transformer to Summarize Videos | | 0 |
| HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization | | 0 |
| Generating Natural Language Summaries for Multimedia | | 0 |
| Global-and-Local Relative Position Embedding for Unsupervised Video Summarization | | 0 |
| Conditional Modeling Based Automatic Video Summarization | | 0 |
| EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos | | 0 |

Benchmark Results

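| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PGL-SUM | F1-score (Canonical) | 55.6 | | Unverified |
| 2 | RR-STG | F1-score (Canonical) | 54.5 | | Unverified |
| 3 | DSNet | F1-score (Canonical) | 53 | | Unverified |
| 4 | VASNet | F1-score (Canonical) | 49.71 | | Unverified |
| 5 | M-AVS | F1-score (Canonical) | 44.4 | | Unverified |
| 6 | CSTA | Kendall's Tau | 0.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | RR-STG | F1-score (Canonical) | 63 | | Unverified |
| 2 | DSNet | F1-score (Canonical) | 62.1 | | Unverified |
| 3 | VASNet | F1-score (Canonical) | 61.42 | | Unverified |
| 4 | M-AVS | F1-score (Canonical) | 61 | | Unverified |
| 5 | PGL-SUM | F1-score (Canonical) | 61 | | Unverified |
| 6 | CSTA | Kendall's Tau | 0.19 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Shotluck-Holmes (3.1B) | CIDEr | 152.3 | | Unverified |
| 2 | Shotluck-Holmes (3.1B) | CIDEr | 63.2 | | Unverified |
| 3 | SUM-shot | CIDEr | 8.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | EgoVLPv2 | F1 (avg) | 52.08 | | Unverified |
| 2 | EgoVLP | F1 (avg) | 49.72 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PGL-SUM | MAP (50%) | 61.6 | | Unverified |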

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VTSUM-BLIP | 1 shot Micro-F1 | 23.5 | | Unverified |