SOTAVerified

Moment Retrieval

Moment retrieval can be defined as the task of "localizing moments in a video given a user query".

Description from: QVHIGHLIGHTS: Detecting Moments and Highlights in Videos via Natural Language Queries

Image credit: QVHIGHLIGHTS: Detecting Moments and Highlights in Videos via Natural Language Queries

Papers

Showing 51-100 of 132 papers

Title | Status | Hype
Deconfounded Video Moment Retrieval with Causal Intervention | Code | 1
Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection | Code | 1
Background-aware Moment Detection for Video Moment Retrieval | Code | 1
Partially Relevant Video Retrieval | Code | 1
Video Moment Retrieval from Text Queries via Single Frame Annotation | Code | 1
Retrieval Augmented Generation Evaluation for Health Documents | - | 0
2DP-2MRC: 2-Dimensional Pointer-based Machine Reading Comprehension Method for Multimodal Moment Retrieval | - | 0
Agent-based Video Trimming | - | 0
A Survey on Video Moment Localization | - | 0
AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval | - | 0
Coarse to Fine: Video Retrieval before Moment Localization | - | 0
Context-Enhanced Video Moment Retrieval with Large Language Models | - | 0
Cross-Lingual Cross-Modal Consolidation for Effective Multilingual Video Corpus Moment Retrieval | - | 0
DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments | - | 0
DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding | - | 0
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection | - | 0
Disentangle and denoise: Tackling context misalignment for video moment retrieval | - | 0
D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matching | - | 0
EAGLE: Egocentric AGgregated Language-video Engine | - | 0
EA-VTR: Event-Aware Video-Text Retrieval | - | 0
Event-aware Video Corpus Moment Retrieval | - | 0
Faster Video Moment Retrieval with Point-Level Supervision | - | 0
Fast Video Moment Retrieval | - | 0
FedVMR: A New Federated Learning method for Video Moment Retrieval | - | 0
Generating Adjacency Matrix for Video Relocalization | - | 0
Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval | - | 0
GPTSee: Enhancing Moment Retrieval and Highlight Detection via Description-Based Similarity Features | - | 0
Graph Neural Network for Video Relocalization | - | 0
Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection | - | 0
Hybrid-Learning Video Moment Retrieval across Multi-Domain Labels | - | 0
Interactive Video Corpus Moment Retrieval using Reinforcement Learning | - | 0
Language Guided Networks for Cross-modal Moment Retrieval | - | 0
Leveraging Generative Language Models for Weakly Supervised Sentence Component Analysis in Video-Language Joint Learning | - | 0
MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment | - | 0
MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval | - | 0
MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval | - | 0
Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval | - | 0
Multi-modal Fusion and Query Refinement Network for Video Moment Retrieval and Highlight Detection | - | 0
Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval | - | 0
Multi-scale 2D Representation Learning for weakly-supervised moment retrieval | - | 0
Multi-sentence Video Grounding for Long Video Generation | - | 0
Multi-video Moment Ranking with Multimodal Clue | - | 0
QD-VMR: Query Debiasing with Contextual Understanding Enhancement for Video Moment Retrieval | - | 0
Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning | - | 0
SCANet: Scene Complexity Aware Network for Weakly-Supervised Video Moment Retrieval | - | 0
SLVideo: A Sign Language Video Moment Retrieval Framework | - | 0
Temporal Perceiving Video-Language Pre-training | - | 0
Text-based Localization of Moments in a Video Corpus | - | 0
The Devil is in the Spurious Correlation: Boosting Moment Retrieval via Temporal Dynamic Learning | - | 0
Temporal Sentence Grounding in Videos: A Survey and Future Directions | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | UnLoc-L | R@1 IoU=0.5 | 66.1 | - | Unverified
2 | UnLoc-B | R@1 IoU=0.5 | 64.5 | - | Unverified
3 | DenoiseLoc | R@1 IoU=0.5 | 59.27 | - | Unverified
4 | SG-DETR (w/ PT) | mAP | 58.8 | - | Unverified
5 | SG-DETR | mAP | 54.1 | - | Unverified
6 | LLaVA-MR | mAP | 52.73 | - | Unverified
7 | FlashVTG | mAP | 52 | - | Unverified
8 | InternVideo2-6B | mAP | 49.24 | - | Unverified
9 | CG-DETR (w/ PT) | mAP | 47.97 | - | Unverified
10 | VideoLights-B-pt | mAP | 47.94 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SG-DETR (w/ PT) | R@1 IoU=0.5 | 71.1 | - | Unverified
2 | LLaVA-MR | R@1 IoU=0.5 | 70.65 | - | Unverified
3 | FlashVTG | R@1 IoU=0.5 | 70.32 | - | Unverified
4 | SG-DETR | R@1 IoU=0.5 | 70.2 | - | Unverified
5 | InternVideo2-6B | R@1 IoU=0.5 | 70.03 | - | Unverified
6 | InternVideo2-1B | R@1 IoU=0.5 | 68.36 | - | Unverified
7 | VideoChat-T (FT) | R@1 IoU=0.5 | 67.1 | - | Unverified
8 | UniMD+Sync. | R@1 IoU=0.5 | 63.98 | - | Unverified
9 | LD-DETR | R@1 IoU=0.5 | 62.58 | - | Unverified
10 | VideoLights-B-pt | R@1 IoU=0.5 | 61.96 | - | Unverified
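
For readers unfamiliar with the R@1 IoU=0.5 metric reported in the benchmark results, the following is a minimal sketch of how such a score is commonly computed: the share of queries whose top-ranked predicted moment overlaps the ground-truth moment with a temporal IoU of at least 0.5. The function names (`temporal_iou`, `recall_at_1`), the `(start, end)` span format in seconds, and the assumption of exactly one ground-truth moment per query are illustrative choices, not taken from this page; individual benchmarks may differ in how they handle multiple ground-truth moments.

```python
from typing import List, Tuple

# A moment is assumed to be a (start_sec, end_sec) pair; this format is hypothetical.
Span = Tuple[float, float]


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two time intervals."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_1(top1_preds: List[Span], gts: List[Span],
                iou_threshold: float = 0.5) -> float:
    """Percentage of queries whose top-1 prediction reaches the IoU threshold.

    Assumes one ground-truth moment per query (an illustrative simplification).
    """
    hits = sum(
        temporal_iou(pred, gt) >= iou_threshold
        for pred, gt in zip(top1_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Toy data, invented for illustration only.
top1_preds = [(10.0, 20.0), (0.0, 5.0), (30.0, 40.0)]
gts = [(12.0, 22.0), (10.0, 15.0), (30.0, 50.0)]
score = recall_at_1(top1_preds, gts)
print(f"R@1 IoU=0.5: {score:.2f}")
```

In this toy example, the first and third predictions reach the threshold and the second does not overlap its ground truth at all, so two of three queries count as hits.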