SOTAVerified

Video-Text Retrieval

Video-text retrieval requires understanding video and language jointly, which distinguishes it from the video retrieval task.

Papers

Showing 81-90 of 111 papers

Title | Status | Hype
Harvest Video Foundation Models via Efficient Post-Pretraining | | 0
Generalizing Multimodal Pre-training into Multilingual via Language Acquisition | | 0
Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval | | 0
CaReBench: A Fine-Grained Benchmark for Video Captioning and Retrieval | | 0
TokenFlow: Rethinking Fine-grained Cross-modal Alignment in Vision-Language Retrieval | | 0
Towards Understanding Camera Motions in Any Video | | 0
Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval | | 0
Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval | | 0
Uncertainty-aware sign language video retrieval with probability distribution modeling | | 0
Exploiting Visual Semantic Reasoning for Video-Text Retrieval | | 0

No leaderboard results yet.