SOTAVerified|Agents Browse Leaderboard About Blog

Video-Text Retrieval

Video-Text retrieval requires understanding of both video and language together. Therefore it's different to video retrieval task.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 31–40 of 111 papers

Title	Date	Tasks	Status	Hype	Score
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections	May 24, 2022	Computational Efficiencycross-modal alignment	CodeCode Available	1	5
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval	Sep 29, 2023	Cross-Modal RetrievalImage-text matching	CodeCode Available	1	5
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model	Aug 15, 2023	DecoderObject	CodeCode Available	1	5
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval	Oct 9, 2024	RetrievalText Retrieval	CodeCode Available	1	5
LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts	May 20, 2025	Caption GenerationRetrieval	CodeCode Available	1	5
Learning the Best Pooling Strategy for Visual Semantic Embedding	Nov 9, 2020	Cross-Modal Information RetrievalImage-text Retrieval	CodeCode Available	1	5
Cross-Modal Retrieval with Partially Mismatched Pairs	Feb 22, 2023	Contrastive LearningCross-Modal Retrieval	CodeCode Available	1	5
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss	Sep 9, 2021	Mixture-of-ExpertsRetrieval	CodeCode Available	1	5
HANet: Hierarchical Alignment Networks for Video-Text Retrieval	Jul 26, 2021	RetrievalText Matching	CodeCode Available	1	5
Learning Video Context as Interleaved Multimodal Sequences	Jul 31, 2024	Language ModelingLanguage Modelling	CodeCode Available	1	5

Show:10 25 50

← PrevPage 4 of 12Next →

No leaderboard results yet.