Video-Text Retrieval

Video-text retrieval requires joint understanding of video and language, which distinguishes it from the video retrieval task.

Papers

Showing 21-30 of 111 papers

Title | Status | Hype
Multi-Scale Temporal Difference Transformer for Video-Text Retrieval | - | 0
Diving Deep into the Motion Representation of Video-Text Models | Code | 0
HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model | - | 0
Uncertainty-aware sign language video retrieval with probability distribution modeling | - | 0
An Empirical Study of Excitation and Aggregation Design Adaptions in CLIP4Clip for Video-Text Retrieval | - | 0
RETTA: Retrieval-Enhanced Test-Time Adaptation for Zero-Shot Video Captioning | - | 0
Learning with Noisy Correspondence | - | 0
HaVTR: Improving Video-Text Retrieval Through Augmentation Using Large Foundation Models | - | 0
vid-TLDR: Training Free Token merging for Light-weight Video Transformer | Code | 2
Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval | - | 0

No leaderboard results yet.