SOTAVerified

Video Summarization

Video Summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or a set of video fragments (a.k.a. video key-fragments) that have been stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
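The difference between a storyboard and a skim can be illustrated with a minimal sketch. It assumes per-frame importance scores are already available from some model, and that the video has been pre-segmented into fragments given as half-open `(start, end)` frame ranges. The function names `storyboard` and `skim`, the greedy selection, and the frame budget are illustrative assumptions, not the method of any paper listed on this page.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Pick the k highest-scoring frames and return their indices in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(
    fragments: List[Tuple[int, int]],
    scores: List[float],
    budget: int,
) -> List[Tuple[int, int]]:
    """Greedily pick fragments by mean frame score until the frame budget is used up.

    Assumption: fragments are non-empty, non-overlapping (start, end) ranges with end exclusive.
    The chosen fragments are returned in chronological order, ready to be stitched.
    """
    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:
            chosen.append(frag)
            used += length
    return sorted(chosen)


if __name__ == "__main__":
    # Hypothetical importance scores for a six-frame video.
    frame_scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.6]
    print(storyboard(frame_scores, k=3))
    print(skim([(0, 2), (2, 4), (4, 6)], frame_scores, budget=4))
```

A greedy pick is used here only to keep the sketch short. Selecting fragments under a length budget is often posed as a knapsack problem instead.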

Papers

Showing 1-10 of 280 papers

Title | Status | Hype
TRIM: A Self-Supervised Video Summarization Framework Maximizing Temporal Relative Information and Representativeness | | 0
Prompts to Summaries: Zero-Shot Language-Guided Video Summarization | | 0
MF2Summ: Multimodal Fusion for Video Summarization with Temporal Alignment | | 0
Enhancing Video Memorability Prediction with Text-Motion Cross-modal Contrastive Loss and Its Application in Video Summarization | | 0
TriPSS: A Tri-Modal Keyframe Extraction Framework Using Perceptual, Structural, and Semantic Representations | | 0
Unsupervised Transcript-assisted Video Summarization and Highlight Detection | | 0
REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing | | 0
SD-VSum: A Method and Dataset for Script-Driven Video Summarization | Code | 0
Video Summarization with Large Language Models | | 0
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | VTSUM-BLIP | 1 shot Micro-F1 | 23.5 | | Unverified