SOTAVerified

Video Summarization

Video Summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of a set of representative video frames (a.k.a. video key-frames) or of video fragments (a.k.a. video key-fragments) that have been stitched together in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
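The two summary types described above can be illustrated with a minimal Python sketch. It assumes that per-frame or per-fragment importance scores are already available (in practice a model would predict them); the function names `storyboard` and `skim`, the half-open `(start, end)` fragment convention, and the greedy budget-filling rule are illustrative assumptions, not taken from the source.

```python
from typing import List, Sequence, Tuple


def storyboard(frame_scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in chronological order."""
    # Rank frame indices by importance score, highest first.
    ranked = sorted(range(len(frame_scores)),
                    key=lambda i: frame_scores[i], reverse=True)
    # Keep the top k and restore temporal order.
    return sorted(ranked[:k])


def skim(fragments: Sequence[Tuple[int, int]],
         fragment_scores: Sequence[float],
         max_frames: int) -> List[Tuple[int, int]]:
    """Pick high-scoring fragments within a frame budget, in chronological order.

    Fragments are half-open (start, end) frame ranges. Greedy selection is an
    assumption made for brevity, not a method described in the source.
    """
    order = sorted(range(len(fragments)),
                   key=lambda i: fragment_scores[i], reverse=True)
    chosen: List[Tuple[int, int]] = []
    used = 0
    for i in order:
        start, end = fragments[i]
        length = end - start
        # Add the fragment only if it still fits in the remaining budget.
        if used + length <= max_frames:
            chosen.append((start, end))
            used += length
    # Stitch the selected fragments back together in temporal order.
    return sorted(chosen)


key_frames = storyboard([0.1, 0.9, 0.3, 0.8, 0.2], k=2)
key_fragments = skim([(0, 10), (10, 30), (30, 40)], [0.2, 0.9, 0.5], max_frames=30)
```

Here `key_frames` holds the frame indices of a two-frame storyboard, and `key_fragments` holds the fragments of a skim that is at most 30 frames long. A greedy pass is only one possible selection rule; the budgeted selection could equally be posed as a knapsack problem.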

Papers

Showing 51–100 of 280 papers

Title | Status | Hype
Visual Question Answering: which investigated applications? | Code | 0
Vis-DSS: An Open-Source toolkit for Visual Data Selection and Summarization | Code | 0
GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization | Code | 0
CSTA: CNN-based Spatiotemporal Attention for Video Summarization | Code | 0
Video Summarization with Long Short-term Memory | Code | 0
What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations | Code | 0
Video Summarization: Towards Entity-Aware Captions | Code | 0
Adaptive frame selection in two dimensional convolutional neural network action recognition | Code | 0
Video Summarization using Deep Semantic Features | Code | 0
Unsupervised video summarization framework using keyframe extraction and video skimming | Code | 0
Weakly-supervised Video Summarization using Variational Encoder-Decoder and Web Prior | Code | 0
A Human-Annotated Video Dataset for Training and Evaluation of 360-Degree Video Summarization Methods | Code | 0
Unsupervised Video Summarization via Iterative Training and Simplified GAN | Code | 0
Unsupervised Video Summarization With Adversarial LSTM Networks | Code | 0
An Integrated System for Spatio-Temporal Summarization of 360-degrees Videos | Code | 0
UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos | Code | 0
Unsupervised multi-latent space reinforcement learning framework for video summarization in ultrasound imaging | Code | 0
DeVAn: Dense Video Annotation for Video-Language Models | Code | 0
CLIP-It! Language-Guided Video Summarization | Code | 0
ERA: Entity Relationship Aware Video Summarization with Wasserstein GAN | Code | 0
Temporal Tessellation: A Unified Approach for Video Analysis | Code | 0
An Integrated Framework for Multi-Granular Explanation of Video Summarization | Code | 0
Siamese Tracking with Lingual Object Constraints | Code | 0
Spatio-Temporal Stability Analysis in Satellite Image Times Series | Code | 0
Towards Practical and Efficient Long Video Summary | Code | 0
Rethinking the Evaluation of Video Summaries | Code | 0
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding | Code | 0
SD-VSum: A Method and Dataset for Script-Driven Video Summarization | Code | 0
Multi-Stream Dynamic Video Summarization | Code | 0
Query-adaptive Video Summarization via Quality-aware Relevance Estimation | Code | 0
Enhancing Video Summarization with Context Awareness | Code | 0
A Challenging Multimodal Video Summary: Simultaneously Extracting and Generating Keyframe-Caption Pairs from Video | Code | 0
Does SpatioTemporal information benefit Two video summarization benchmarks? | Code | 0
Integrate the temporal scheme for unsupervised video summarization via attention mechanism | Code | 0
Summarizing Videos with Attention | Code | 0
Cluster-based Video Summarization with Temporal Context Awareness | Code | 0
Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision | Code | 0
ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization | Code | 0
SELF-VS: Self-supervised Encoding Learning For Video Summarization | Code | 0
Attention is all you need for Videos: Self-attention based Video Summarization using Universal Transformers | - | 0
Discovery of Shared Semantic Spaces for Multi-Scene Video Query and Summarization | - | 0
Detecting Engagement in Egocentric Video | - | 0
A Survey on Recent Advances of Computer Vision Algorithms for Egocentric Video | - | 0
A Multi-stage deep architecture for summary generation of soccer videos | - | 0
Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance | - | 0
DeepQAMVS: Query-Aware Hierarchical Pointer Networks for Multi-Video Summarization | - | 0
Cycle-SUM: Cycle-consistent Adversarial LSTM Networks for Unsupervised Video Summarization | - | 0
A Survey on Patch-based Synthesis: GPU Implementation and Optimization | - | 0
A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PGL-SUM | F1-score (Canonical) | 55.6 | - | Unverified
2 | RR-STG | F1-score (Canonical) | 54.5 | - | Unverified
3 | DSNet | F1-score (Canonical) | 53 | - | Unverified
4 | VASNet | F1-score (Canonical) | 49.71 | - | Unverified
5 | M-AVS | F1-score (Canonical) | 44.4 | - | Unverified
6 | CSTA | Kendall's Tau | 0.25 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | RR-STG | F1-score (Canonical) | 63 | - | Unverified
2 | DSNet | F1-score (Canonical) | 62.1 | - | Unverified
3 | VASNet | F1-score (Canonical) | 61.42 | - | Unverified
4 | M-AVS | F1-score (Canonical) | 61 | - | Unverified
5 | PGL-SUM | F1-score (Canonical) | 61 | - | Unverified
6 | CSTA | Kendall's Tau | 0.19 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Shotluck-Holmes (3.1B) | CIDEr | 152.3 | - | Unverified
2 | Shotluck-Holmes (3.1B) | CIDEr | 63.2 | - | Unverified
3 | SUM-shot | CIDEr | 8.6 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | EgoVLPv2 | F1 (avg) | 52.08 | - | Unverified
2 | EgoVLP | F1 (avg) | 49.72 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PGL-SUM | MAP (50%) | 61.6 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VTSUM-BLIP | 1 shot Micro-F1 | 23.5 | - | Unverified