SOTAVerified

Text to Video Retrieval

She's gone I can't find her anywhere I'm looking everywhere for her Everywhere is dark

Papers

Showing 1120 of 75 papers

TitleStatusHype
Sakuga-42M Dataset: Scaling Up Cartoon Research0
Learning text-to-video retrieval from image captioning0
Distilling Vision-Language Models on Millions of Videos0
Holistic Features are almost Sufficient for Text-to-Video RetrievalCode1
Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation LearningCode1
Leveraging Generative Language Models for Weakly Supervised Sentence Component Analysis in Video-Language Joint Learning0
E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer0
VideoCon: Robust Video-Language Alignment via Contrast CaptionsCode1
An Empirical Study of Frame Selection for Text-to-Video Retrieval0
Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and DataCode1
Show:102550
← PrevPage 2 of 8Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1FROZEN-revisedmAP23.39Unverified
2FROZEN-revised (two-stream)text-to-video R@112.8Unverified
#ModelMetricClaimedVerifiedStatus
1CLIP4Cliptext-to-video R@144.5Unverified
#ModelMetricClaimedVerifiedStatus
1X-CLIP (Cross-Lingual)R@132.3Unverified