SOTAVerified

Video-Text Retrieval

Video-text retrieval requires a joint understanding of video and language, which distinguishes it from the video retrieval task.
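To make the task concrete, here is a minimal sketch of how text-to-video retrieval is commonly scored, assuming each caption and each video has already been mapped to an embedding in a shared space. The toy embeddings, the function names (`cosine`, `rank_videos`, `recall_at_k`), and the convention that caption i matches video i are all illustrative assumptions. They are not taken from any paper listed on this page.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_videos(text_emb, video_embs):
    """Return video indices sorted from most to least similar to the text."""
    scores = [cosine(text_emb, v) for v in video_embs]
    return sorted(range(len(video_embs)), key=lambda i: scores[i], reverse=True)

def recall_at_k(text_embs, video_embs, k):
    """Fraction of captions whose matching video appears in the top k.

    Assumption: caption i is paired with video i.
    """
    hits = sum(
        1 for i, t in enumerate(text_embs)
        if i in rank_videos(t, video_embs)[:k]
    )
    return hits / len(text_embs)

# Hypothetical embeddings; in practice these come from learned
# video and text encoders.
video_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
text_embs = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.6, 0.0, 0.5]]

print(rank_videos(text_embs[2], video_embs))
print(recall_at_k(text_embs, video_embs, k=1))
print(recall_at_k(text_embs, video_embs, k=2))
```

In this toy setup the third caption sits slightly closer to the first video than to its own, so it counts as a miss at k=1 and a hit at k=2. Ranking by similarity and reporting recall at several cutoffs is a common evaluation pattern for this task, though individual papers may use different similarity functions or metrics.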

Papers

Showing 61-70 of 111 papers

| Title | Status | Hype |
|---|---|---|
| UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling | Code | 1 |
| Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval | - | 0 |
| Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring | Code | 1 |
| MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval | Code | 1 |
| Test of Time: Instilling Video-Language Models with a Sense of Time | Code | 1 |
| HiVLP: Hierarchical Interactive Video-Language Pre-Training | - | 0 |
| Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval | - | 0 |
| ViLEM: Visual-Language Error Modeling for Image-Text Retrieval | - | 0 |
| Masked Contrastive Pre-Training for Efficient Video-Text Retrieval | - | 0 |
| Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning | Code | 1 |
Page 7 of 12

No leaderboard results yet.