SOTAVerified

Video-Text Retrieval

Video-text retrieval requires understanding video and language jointly, which distinguishes it from the video retrieval task.

Papers

Showing 71-80 of 111 papers

Title | Status | Hype
VTC: Improving Video-Text Retrieval with User Comments | Code | 1
Vision-Language Pre-training: Basics, Recent Advances, and Future Trends | Code | 3
TokenFlow: Rethinking Fine-grained Cross-modal Alignment in Vision-Language Retrieval | - | 0
Unified Loss of Pair Similarity Optimization for Vision-Language Retrieval | - | 0
Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval | - | 0
OmniVL:One Foundation Model for Image-Language and Video-Language Tasks | - | 0
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment | Code | 2
Boosting Video-Text Retrieval with Explicit High-Level Semantics | - | 0
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval | Code | 1
LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval | - | 0
Page 8 of 12

No leaderboard results yet.