SOTAVerified

Video-Text Retrieval

Video-text retrieval requires understanding video and language jointly, which distinguishes it from the video retrieval task.

Papers

Showing 81–90 of 111 papers

Title | Status | Hype
NAVERO: Unlocking Fine-Grained Semantics for Video-Language Compositionality | | 0
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks | | 0
Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment | | 0
RETTA: Retrieval-Enhanced Test-Time Adaptation for Zero-Shot Video Captioning | | 0
Retrieving and Highlighting Action with Spatiotemporal Reference | | 0
Stacked Convolutional Deep Encoding Network for Video-Text Retrieval | | 0
Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding | | 0
Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding | | 0
Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval | | 0
Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval | | 0

No leaderboard results yet.