SOTAVerified

Text to Video Retrieval

She's gone. I can't find her anywhere. I'm looking everywhere for her. Everywhere is dark.

Papers

Showing 21-30 of 75 papers

Title | Status | Hype
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval | Code | 1
Revitalize Region Feature for Democratizing Video-Language Pre-training of Retrieval | Code | 1
Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval | Code | 1
Bridging Video-text Retrieval with Multiple Choice Questions | Code | 1
Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval | Code | 1
VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling | Code | 1
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions | Code | 1
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation | Code | 1
DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization | Code | 1
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FROZEN-revised | mAP | 23.39 | - | Unverified
2 | FROZEN-revised (two-stream) | text-to-video R@1 | 12.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CLIP4Clip | text-to-video R@1 | 44.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | X-CLIP (Cross-Lingual) | R@1 | 32.3 | - | Unverified