SOTAVerified

Video Summarization

Video summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary usually consists of either a set of representative video frames (a.k.a. video key-frames) or a set of video fragments (a.k.a. video key-fragments) stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
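
The two summary types described above can be illustrated with a minimal sketch. It assumes per-frame importance scores and fragment boundaries are already available (for example, from a trained scoring model and a shot-boundary detector). The function names `storyboard` and `skim`, the greedy selection rule, and the frame budget are illustrative assumptions, not a method taken from any paper listed on this page.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(fragments: List[Tuple[int, int]], scores: List[float],
         budget: int) -> List[Tuple[int, int]]:
    """Greedily pick fragments (start, end), end exclusive, by mean frame score
    until the frame budget is exhausted, then return them in chronological order.

    Assumption: greedy selection is used here for brevity; it is only one of
    several possible selection strategies.
    """
    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:
            chosen.append(frag)
            used += length
    return sorted(chosen)


if __name__ == "__main__":
    # Hypothetical importance scores for an 8-frame video split into 4 fragments.
    frame_scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.6]
    shot_fragments = [(0, 2), (2, 4), (4, 6), (6, 8)]
    print(storyboard(frame_scores, k=3))
    print(skim(shot_fragments, frame_scores, budget=4))
```

In both cases the selected items are sorted by position before being returned, which mirrors the chronological ordering mentioned in the description.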

Papers

Showing 151-200 of 280 papers

Title | Status | Hype
Summary Transfer: Exemplar-based Subset Selection for Video Summarization | | 0
SUSiNet: See, Understand and Summarize it | | 0
Temporally Coherent Bayesian Models for Entity Discovery in Videos by Tracklet Clustering | | 0
Text Synopsis Generation for Egocentric Videos | | 0
The Power of Subsampling in Submodular Maximization | | 0
TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency | | 0
Transforming Multi-Concept Attention into Video Summarization | | 0
TRIM: A Self-Supervised Video Summarization Framework Maximizing Temporal Relative Information and Representativeness | | 0
TriPSS: A Tri-Modal Keyframe Extraction Framework Using Perceptual, Structural, and Semantic Representations | | 0
TruNet: Short Videos Generation from Long Videos via Story-Preserving Truncation | | 0
TVSum: Summarizing Web Videos Using Titles | | 0
Understanding the Predictability of Gesture Parameters from Speech and their Perceptual Importance | | 0
Unsupervised Object-Level Video Summarization with Online Motion Auto-Encoder | | 0
Unsupervised Transcript-assisted Video Summarization and Highlight Detection | | 0
Unsupervised Video Summarization via Reinforcement Learning and a Trained Evaluator | | 0
Unsupervised Video Summarization with a Convolutional Attentive Adversarial Network | | 0
Use of Affective Visual Information for Summarization of Human-Centric Videos | | 0
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning | | 0
Video Co-Summarization: Video Summarization by Visual Co-Occurrence | | 0
Video Object Segmentation and Tracking: A Survey | | 0
Video Skimming: Taxonomy and Comprehensive Survey | | 0
Video Summarization by Learning Submodular Mixtures of Objectives | | 0
Video Summarization in a Multi-View Camera Network | | 0
Video Summarization Overview | | 0
Video Summarization: Study of various techniques | | 0
Video Summarization Techniques: A Comprehensive Review | | 0
A Mobile Robot Generating Video Summaries of Seniors' Indoor Activities | | 0
Video Summarization through Reinforcement Learning with a 3D Spatio-Temporal U-Net | | 0
Video Summarization Using Deep Neural Networks: A Survey | | 0
Video Summarization using Denoising Diffusion Probabilistic Model | | 0
Video Summarization Using Fully Convolutional Sequence Networks | | 0
Video Summarization via Actionness Ranking | | 0
Video Summarization with Attention-Based Encoder-Decoder Networks | | 0
Video Summarization with Large Language Models | | 0
Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling | | 0
Viewpoint-aware Video Summarization | | 0
Visual Recognition by Counting Instances: A Multi-Instance Cardinality Potential Kernel | | 0
Visual Summarization of Scholarly Videos using Word Embeddings and Keyphrase Extraction | | 0
VSCAN: An Enhanced Video Summarization using Density-based Spatial Clustering | | 0
Weakly Supervised Video Summarization by Hierarchical Reinforcement Learning | | 0
Recognizing Micro-Actions and Reactions From Paired Egocentric Videos | | 0
A Dataset and Preliminary Results for Umpire Pose Detection Using SVM Classification of Deep Features | | 0
A Framework towards Domain Specific Video Summarization | | 0
A General Framework for Edited Video and Raw Video Summarization | | 0
Agent-based Video Trimming | | 0
A Graph-based Ranking Approach to Extract Key-frames for Static Video Summarization | | 0
A Memory Network Approach for Story-based Temporal Summarization of 360° Videos | | 0
A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos | | 0
A Multi-stage deep architecture for summary generation of soccer videos | | 0
An Attention-Based Speaker Naming Method for Online Adaptation in Non-Fixed Scenarios | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PGL-SUM | F1-score (Canonical) | 55.6 | | Unverified
2 | RR-STG | F1-score (Canonical) | 54.5 | | Unverified
3 | DSNet | F1-score (Canonical) | 53 | | Unverified
4 | VASNet | F1-score (Canonical) | 49.71 | | Unverified
5 | M-AVS | F1-score (Canonical) | 44.4 | | Unverified
6 | CSTA | Kendall's Tau | 0.25 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RR-STG | F1-score (Canonical) | 63 | | Unverified
2 | DSNet | F1-score (Canonical) | 62.1 | | Unverified
3 | VASNet | F1-score (Canonical) | 61.42 | | Unverified
4 | M-AVS | F1-score (Canonical) | 61 | | Unverified
5 | PGL-SUM | F1-score (Canonical) | 61 | | Unverified
6 | CSTA | Kendall's Tau | 0.19 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Shotluck-Holmes (3.1B) | CIDEr | 152.3 | | Unverified
2 | Shotluck-Holmes (3.1B) | CIDEr | 63.2 | | Unverified
3 | SUM-shot | CIDEr | 8.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | EgoVLPv2 | F1 (avg) | 52.08 | | Unverified
2 | EgoVLP | F1 (avg) | 49.72 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PGL-SUM | MAP (50%) | 61.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTSUM-BLIP | 1 shot Micro-F1 | 23.5 | | Unverified