Video-Text Retrieval

Video-text retrieval requires joint understanding of video and language, which distinguishes it from the video retrieval task.

Papers

Showing 101-111 of 111 papers

Title | Status | Hype
ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation | - | 0
CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations | - | 0
Learning Context-Adapted Video-Text Retrieval by Attending to User Comments | - | 0
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval | - | 0
HiT: Hierarchical Transformer with Momentum Contrast for Video-Text Retrieval | - | 0
Rudder: A Cross Lingual Video and Text Retrieval Dataset | Code | 0
Exploiting Visual Semantic Reasoning for Video-Text Retrieval | - | 0
Retrieving and Highlighting Action with Spatiotemporal Reference | - | 0
Stacked Convolutional Deep Encoding Network for Video-Text Retrieval | - | 0
Deep Semantic Multimodal Hashing Network for Scalable Image-Text and Video-Text Retrievals | - | 0
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval | Code | 0

No leaderboard results yet.