SOTAVerified

Highlight Detection

https://youtu.be/pJ0auP7dbcY?si=vSiZevfJ57YUKC2q

Papers

Showing 1-50 of 78 papers

Title | Status | Hype
Unsupervised Transcript-assisted Video Summarization and Highlight Detection | - | 0
Rhapsody: A Dataset for Highlight Detection in Podcasts | Code | 0
Gameplay Highlights Generation | - | 0
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action | Code | 1
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention | - | 0
Multi-modal Fusion and Query Refinement Network for Video Moment Retrieval and Highlight Detection | - | 0
LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection | Code | 1
Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection | Code | 1
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding | Code | 1
Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark | Code | 1
Agent-based Video Trimming | - | 0
Video LLMs for Temporal Reasoning in Long Videos | - | 0
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval | Code | 1
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format | Code | 1
Number it: Temporal Grounding Videos like Flipping Manga | Code | 2
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Code | 2
Saliency-Guided DETR for Moment Retrieval and Highlight Detection | Code | 1
D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matching | - | 0
Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection | Code | 3
Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval | Code | 2
Unsupervised Video Highlight Detection by Learning from Audio and Visual Recurrence | - | 0
A Multimodal Transformer for Live Streaming Highlight Prediction | - | 0
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding | Code | 2
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection | Code | 1
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding | Code | 0
Unleash the Potential of CLIP for Video Highlight Detection | Code | 0
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding | Code | 0
Unsupervised Modality-Transferable Video Highlight Detection with Representation Activation Sequence Learning | - | 0
GPTSee: Enhancing Moment Retrieval and Highlight Detection via Description-Based Similarity Features | - | 0
TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection | Code | 2
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Code | 2
Joint network for specular highlight detection and adversarial generation of specular-free images trained with polarimetric data | Code | 0
Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection | Code | 1
Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | Code | 2
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection | - | 0
UniVTG: Towards Unified Video-Language Temporal Grounding | Code | 2
Joint Moment Retrieval and Highlight Detection Via Natural Language Queries | Code | 1
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer | Code | 1
Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies | Code | 1
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | Code | 2
SpecSeg Network for Specular Highlight Detection and Segmentation in Real-World Images | Code | 1
M2-Net: Multi-stages Specular Highlight Detection and Removal in Multi-scenes | Code | 1
Show Me What I Like: Detecting User-Specific Video Highlights Using Content-Based Multi-Head Attention | - | 0
Probing Visual-Audio Representation for Video Highlight Detection via Hard-Pairs Guided Contrastive Learning | - | 0
0/1 Deep Neural Networks via Block Coordinate Descent | - | 0
AntPivot: Livestream Highlight Detection via Hierarchical Attention Mechanism | - | 0
Learning Pixel-Level Distinctions for Video Highlight Detection | - | 0
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | Code | 2
Smart Director: An Event-Driven Directing System for Live Broadcasting | - | 0
Contrastive Learning for Unsupervised Video Highlight Detection | - | 0

Benchmark Results

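# | Model | Metric | Claimed | Verified | Status
1 | QD-DETR (only Video w/ PT) | Hit@1 | 61.91 | - | Unverified
2 | HL-CLIP | mAP | 41.94 | - | Unverified
3 | NumPro | mAP | 40.54 | - | Unverified
4 | UMT (w. PT) | mAP | 39.12 | - | Unverified
5 | QD-DETR | mAP | 39.04 | - | Unverified
6 | UniVTG | mAP | 38.2 | - | Unverified
7 | UMT | mAP | 38.18 | - | Unverified
8 | Moment-DETR w/ PT | mAP | 37.43 | - | Unverified
9 | VideoChat-T (FT) | mAP | 27 | - | Unverified
10 | VideoChat-T (ZS) | mAP | 26.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | FlashVTG | mAP | 88 | - | Unverified
2 | SG-DETR | mAP | 87.1 | - | Unverified
3 | CG-DETR | mAP | 86.8 | - | Unverified
4 | QD-DETR | mAP | 86.6 | - | Unverified
5 | UVCOM (train from scratch) | mAP | 86.3 | - | Unverified
6 | QD-DETR (only Video) | mAP | 85 | - | Unverified
7 | UMT | mAP | 83.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SG-DETR (w/ PT) | mAP | 78 | - | Unverified
2 | UVCOM | mAP | 77.4 | - | Unverified
3 | SG-DETR | mAP | 76.7 | - | Unverified
4 | CG-DETR | mAP | 75.9 | - | Unverified
5 | FlashVTG | mAP | 75.4 | - | Unverified
6 | LLMEPET | mAP | 75.3 | - | Unverified
7 | UMT | mAP | 74.9 | - | Unverified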