SOTAVerified

Highlight Detection

https://youtu.be/pJ0auP7dbcY?si=vSiZevfJ57YUKC2q

Papers

Showing 1-25 of 78 papers

Title | Status | Hype
Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection | Code | 3
Number it: Temporal Grounding Videos like Flipping Manga | Code | 2
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Code | 2
Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval | Code | 2
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding | Code | 2
TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection | Code | 2
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Code | 2
Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | Code | 2
UniVTG: Towards Unified Video-Language Temporal Grounding | Code | 2
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | Code | 2
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | Code | 2
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action | Code | 1
LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection | Code | 1
Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection | Code | 1
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding | Code | 1
Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark | Code | 1
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval | Code | 1
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format | Code | 1
Saliency-Guided DETR for Moment Retrieval and Highlight Detection | Code | 1
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection | Code | 1
Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection | Code | 1
Joint Moment Retrieval and Highlight Detection Via Natural Language Queries | Code | 1
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer | Code | 1
Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies | Code | 1
SpecSeg Network for Specular Highlight Detection and Segmentation in Real-World Images | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | QD-DETR (only Video w/ PT) | Hit@1 | 61.91 | | Unverified
2 | HL-CLIP | mAP | 41.94 | | Unverified
3 | NumPro | mAP | 40.54 | | Unverified
4 | UMT (w. PT) | mAP | 39.12 | | Unverified
5 | QD-DETR | mAP | 39.04 | | Unverified
6 | UniVTG | mAP | 38.2 | | Unverified
7 | UMT | mAP | 38.18 | | Unverified
8 | Moment-DETR w/ PT | mAP | 37.43 | | Unverified
9 | VideoChat-T (FT) | mAP | 27 | | Unverified
10 | VideoChat-T (ZS) | mAP | 26.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | FlashVTG | mAP | 88 | | Unverified
2 | SG-DETR | mAP | 87.1 | | Unverified
3 | CG-DETR | mAP | 86.8 | | Unverified
4 | QD-DETR | mAP | 86.6 | | Unverified
5 | UVCOM (train from scratch) | mAP | 86.3 | | Unverified
6 | QD-DETR (only Video) | mAP | 85 | | Unverified
7 | UMT | mAP | 83.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SG-DETR (w/ PT) | mAP | 78 | | Unverified
2 | UVCOM | mAP | 77.4 | | Unverified
3 | SG-DETR | mAP | 76.7 | | Unverified
4 | CG-DETR | mAP | 75.9 | | Unverified
5 | FlashVTG | mAP | 75.4 | | Unverified
6 | LLMEPET | mAP | 75.3 | | Unverified
7 | UMT | mAP | 74.9 | | Unverified