SOTAVerified

Video Summarization

Video summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or a set of video fragments (a.k.a. video key-fragments) stitched together in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.
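The difference between a storyboard and a skim can be made concrete with a minimal sketch. It assumes per-frame importance scores and fragment boundaries are already available (in practice they would come from a trained model and a shot or segment detector); the function names `storyboard` and `skim`, the greedy selection rule, and the frame budget are illustrative assumptions, not a method taken from any of the listed papers.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(
    scores: List[float],
    fragments: List[Tuple[int, int]],
    budget: int,
) -> List[Tuple[int, int]]:
    """Greedily pick fragments (start, end-exclusive) by mean frame score
    until the frame budget is exhausted, then restore chronological order."""

    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:
            chosen.append(frag)
            used += length
    return sorted(chosen)


# Hypothetical scores for an 8-frame video split into four 2-frame fragments.
scores = [0.1, 0.9, 0.8, 0.6, 0.3, 0.2, 0.95, 0.5]
fragments = [(0, 2), (2, 4), (4, 6), (6, 8)]

print(storyboard(scores, k=3))            # key-frame indices
print(skim(scores, fragments, budget=4))  # key-fragments as (start, end)
```

Greedy selection is used here only for brevity; published methods often treat fragment selection under a length budget as a knapsack problem instead.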

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET

Papers

Showing 151-200 of 280 papers

| Title | Status | Hype |
|---|---|---|
| Online Learnable Keyframe Extraction in Videos and its Application with Semantic Word Vector in Action Recognition | | 0 |
| Exploring global diverse attention via pairwise temporal relation for video summarization | | 0 |
| Multi-modal Summarization for Video-containing Documents | Code | 1 |
| Image Conditioned Keyframe-Based Video Summarization Using Object Detection | | 0 |
| Query Twice: Dual Mixture Attention Meta Learning for Video Summarization | | 0 |
| Global-and-Local Relative Position Embedding for Unsupervised Video Summarization | | 0 |
| Compare and Select: Video Summarization with Multi-Agent Reinforcement Learning | | 0 |
| Realistic Video Summarization through VISIOCITY: A New Benchmark and Evaluation Framework | | 0 |
| SumGraph: Video Summarization via Recursive Graph Modeling | | 0 |
| Submodular Maximization in Clean Linear Time | | 0 |
| Transforming Multi-Concept Attention into Video Summarization | | 0 |
| Ultrasound Video Summarization using Deep Reinforcement Learning | Code | 1 |
| A Survey on Patch-based Synthesis: GPU Implementation and Optimization | | 0 |
| Text Synopsis Generation for Egocentric Videos | | 0 |
| Query-controllable Video Summarization | Code | 1 |
| Group Activity Recognition by Using Effective Multiple Modality Relation Representation With Temporal-Spatial Attention | | 0 |
| Convolutional Hierarchical Attention Network for Query-Focused Video Summarization | Code | 1 |
| Weakly Supervised Video Summarization by Hierarchical Reinforcement Learning | | 0 |
| Unsupervised Video Summarization via Attention-Driven Adversarial Learning | Code | 0 |
| ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization | Code | 0 |
| An Attention-Based Speaker Naming Method for Online Adaptation in Non-Fixed Scenarios | | 0 |
| A Graph-based Ranking Approach to Extract Key-frames for Static Video Summarization | | 0 |
| Visual Summarization of Scholarly Videos using Word Embeddings and Keyphrase Extraction | | 0 |
| Non-Monotone Submodular Maximization with Multiple Knapsacks in Static and Dynamic Settings | | 0 |
| Comprehensive Video Understanding: Video summarization with content-based video recommender design | | 0 |
| A Stepwise, Label-based Approach for Improving the Adversarial Training in Unsupervised Video Summarization | Code | 0 |
| TruNet: Short Videos Generation from Long Videos via Story-Preserving Truncation | | 0 |
| Multi-modal Deep Analysis for Multimedia | | 0 |
| Unsupervised video summarization framework using keyframe extraction and video skimming | Code | 0 |
| Video Skimming: Taxonomy and Comprehensive Survey | | 0 |
| Meta Learning for Task-Driven Video Summarization | | 0 |
| A Novel Approach for Robust Multi Human Action Recognition and Summarization based on 3D Convolutional Neural Networks | | 0 |
| Attention is all you need for Videos: Self-attention based Video Summarization using Universal Transformers | | 0 |
| Hierarchical Recurrent Neural Network for Video Summarization | | 0 |
| A General Framework for Edited Video and Raw Video Summarization | | 0 |
| NLP Driven Ensemble Based Automatic Subtitle Generation and Semantic Video Summarization Technique | | 0 |
| Video Object Segmentation and Tracking: A Survey | | 0 |
| Cycle-SUM: Cycle-consistent Adversarial LSTM Networks for Unsupervised Video Summarization | | 0 |
| FrameRank: A Text Processing Approach to Video Summarization | | 0 |
| Rethinking the Evaluation of Video Summaries | Code | 0 |
| Video Summarization via Actionness Ranking | | 0 |
| A Mobile Robot Generating Video Summaries of Seniors' Indoor Activities | | 0 |
| Human Pose Estimation using Motion Priors and Ensemble Models | | 0 |
| Real-time Video Summarization on Commodity Hardware | | 0 |
| Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance | | 0 |
| Summarizing Videos with Attention | Code | 0 |
| SUSiNet: See, Understand and Summarize it | | 0 |
| Sequence-to-Segment Networks for Segment Detection | | 0 |
| Multi-Stream Dynamic Video Summarization | Code | 0 |
| Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision | Code | 0 |

Benchmark Results

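| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PGL-SUM | F1-score (Canonical) | 55.6 | | Unverified |
| 2 | RR-STG | F1-score (Canonical) | 54.5 | | Unverified |
| 3 | DSNet | F1-score (Canonical) | 53 | | Unverified |
| 4 | VASNet | F1-score (Canonical) | 49.71 | | Unverified |
| 5 | M-AVS | F1-score (Canonical) | 44.4 | | Unverified |
| 6 | CSTA | Kendall's Tau | 0.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | RR-STG | F1-score (Canonical) | 63 | | Unverified |
| 2 | DSNet | F1-score (Canonical) | 62.1 | | Unverified |
| 3 | VASNet | F1-score (Canonical) | 61.42 | | Unverified |
| 4 | M-AVS | F1-score (Canonical) | 61 | | Unverified |
| 5 | PGL-SUM | F1-score (Canonical) | 61 | | Unverified |
| 6 | CSTA | Kendall's Tau | 0.19 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Shotluck-Holmes (3.1B) | CIDEr | 152.3 | | Unverified |
| 2 | Shotluck-Holmes (3.1B) | CIDEr | 63.2 | | Unverified |
| 3 | SUM-shot | CIDEr | 8.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EgoVLPv2 | F1 (avg) | 52.08 | | Unverified |
| 2 | EgoVLP | F1 (avg) | 49.72 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PGL-SUM | MAP (50%) | 61.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VTSUM-BLIP | 1 shot Micro-F1 | 23.5 | | Unverified |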
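
Two of the metrics named in the tables above, F1-score and Kendall's Tau, can be sketched in a few lines. This is a generic illustration under stated assumptions, not the exact evaluation protocol behind the listed numbers: the page does not say how the "Canonical" split is defined, how scores are aggregated over multiple annotators, or which tau variant is used. The function names `summary_f1` and `kendall_tau_a` are hypothetical, and tau-a (ties counted as neither concordant nor discordant) is chosen only for simplicity.

```python
from itertools import combinations
from typing import Sequence


def summary_f1(pred: Sequence[int], ref: Sequence[int]) -> float:
    """F1 between two binary per-frame selection vectors (1 = frame is in the summary)."""
    overlap = sum(p and r for p, r in zip(pred, ref))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred)
    recall = overlap / sum(ref)
    return 2 * precision * recall / (precision + recall)


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical data: a predicted vs. a reference frame selection,
# and predicted vs. human importance scores for four frames.
pred_sel = [1, 1, 0, 0, 1, 0]
ref_sel = [1, 0, 0, 1, 1, 0]
pred_scores = [0.1, 0.4, 0.3, 0.9]
human_scores = [1, 2, 3, 4]

print(round(summary_f1(pred_sel, ref_sel), 3))
print(round(kendall_tau_a(pred_scores, human_scores), 3))
```

F1 measures overlap between selected frames, while a rank correlation such as Kendall's Tau compares the ordering of predicted importance scores against human ones, so the two can rank the same models differently.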