SOTAVerified

Video Summarization

Video summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or video fragments (a.k.a. video key-fragments) that have been stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
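
The two summary types described above can be illustrated with a minimal Python sketch. It assumes per-frame importance scores are already available (for example, from one of the models listed below); the function names `storyboard` and `skim`, the frame budget, and the greedy fragment selection are hypothetical choices made for illustration and are not taken from the survey or from any listed paper.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(fragments: List[Tuple[int, int]], scores: List[float],
         budget: int) -> List[Tuple[int, int]]:
    """Greedily pick fragments (start, end) with end exclusive, highest mean
    score first, while the total length stays within `budget` frames.
    The chosen fragments are returned in chronological order.
    Greedy selection is an illustrative assumption, not a prescribed method."""
    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:
            chosen.append(frag)
            used += length
    return sorted(chosen)


# Hypothetical scores for an 8-frame video split into four 2-frame fragments.
frame_scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.6]
frame_fragments = [(0, 2), (2, 4), (4, 6), (6, 8)]

key_frames = storyboard(frame_scores, k=3)
key_fragments = skim(frame_fragments, frame_scores, budget=4)
```

Here `key_frames` plays the role of a storyboard (a set of frame indices) and `key_fragments` the role of a skim (fragments to be stitched together in order). Sorting the selection at the end is what keeps both summaries chronological.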

Papers

Showing 101–125 of 280 papers

| Title | Status | Hype |
| --- | --- | --- |
| ElasticPlay: Interactive Video Summarization with Dynamic Time Budgets | | 0 |
| Hierarchical Recurrent Neural Network for Video Summarization | | 0 |
| Long-Term Identity-Aware Multi-Person Tracking for Surveillance Video Summarization | | 0 |
| Masked Autoencoder for Unsupervised Video Summarization | | 0 |
| Motion-Based Sign Language Video Summarization using Curvature and Torsion | | 0 |
| How Local is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization | | 0 |
| HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization | | 0 |
| Human Pose Estimation using Motion Priors and Ensemble Models | | 0 |
| CSTA: CNN-based Spatiotemporal Attention for Video Summarization | | 0 |
| Image Conditioned Keyframe-Based Video Summarization Using Object Detection | | 0 |
| EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos | | 0 |
| Cycle-SUM: Cycle-consistent Adversarial LSTM Networks for Unsupervised Video Summarization | | 0 |
| Efficient Video Summarization Framework using EEG and Eye-tracking Signals | | 0 |
| DeepQAMVS: Query-Aware Hierarchical Pointer Networks for Multi-Video Summarization | | 0 |
| Beyond the Frame: Single and mutilple video summarization method with user-defined length | | 0 |
| Joint Summarization of Large-scale Collections of Web Images and Videos for Storyline Reconstruction | | 0 |
| Joint Video Summarization and Moment Localization by Cross-Task Sample Transfer | | 0 |
| Key Frame Extraction with Attention Based Deep Neural Networks | | 0 |
| A New Action Recognition Framework for Video Highlights Summarization in Sporting Events | | 0 |
| Large-Margin Determinantal Point Processes | | 0 |
| Large Model based Sequential Keyframe Extraction for Video Summarization | | 0 |
| Large-Scale Video Summarization Using Web-Image Priors | | 0 |
| EDSNet: Efficient-DSNet for Video Summarization | | 0 |
| Learning to Summarize Videos by Contrasting Clips | | 0 |
| Dynamic Non-monotone Submodular Maximization | | 0 |

Benchmark Results

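| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PGL-SUM | F1-score (Canonical) | 55.6 | | Unverified |
| 2 | RR-STG | F1-score (Canonical) | 54.5 | | Unverified |
| 3 | DSNet | F1-score (Canonical) | 53 | | Unverified |
| 4 | VASNet | F1-score (Canonical) | 49.71 | | Unverified |
| 5 | M-AVS | F1-score (Canonical) | 44.4 | | Unverified |
| 6 | CSTA | Kendall's Tau | 0.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | RR-STG | F1-score (Canonical) | 63 | | Unverified |
| 2 | DSNet | F1-score (Canonical) | 62.1 | | Unverified |
| 3 | VASNet | F1-score (Canonical) | 61.42 | | Unverified |
| 4 | M-AVS | F1-score (Canonical) | 61 | | Unverified |
| 5 | PGL-SUM | F1-score (Canonical) | 61 | | Unverified |
| 6 | CSTA | Kendall's Tau | 0.19 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Shotluck-Holmes (3.1B) | CIDEr | 152.3 | | Unverified |
| 2 | Shotluck-Holmes (3.1B) | CIDEr | 63.2 | | Unverified |
| 3 | SUM-shot | CIDEr | 8.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | EgoVLPv2 | F1 (avg) | 52.08 | | Unverified |
| 2 | EgoVLP | F1 (avg) | 49.72 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PGL-SUM | MAP (50%) | 61.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VTSUM-BLIP | 1 shot Micro-F1 | 23.5 | | Unverified |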