SOTAVerified|Agents Browse Leaderboard About Blog

Video-Text Retrieval

Video-Text retrieval requires understanding of both video and language together. Therefore it's different to video retrieval task.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 51–60 of 111 papers

Title	Date	Tasks	Status	Hype
Towards Understanding Camera Motions in Any Video	Apr 21, 2025	Question AnsweringText Retrieval	—Unverified	0
LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders	Apr 4, 2025	Self-Supervised LearningText Retrieval	—Unverified	0
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval	Apr 3, 2025	Information RetrievalRepresentation Learning	—Unverified	0
V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts	Mar 3, 2025	Contrastive LearningText Retrieval	—Unverified	0
Expertized Caption Auto-Enhancement for Video-Text Retrieval	Feb 5, 2025	Caption GenerationRetrieval	CodeCode Available	0
Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment	Jan 1, 2025	RelationRetrieval	—Unverified	0
V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts	Jan 1, 2025	Contrastive LearningText Retrieval	—Unverified	0
CaReBench: A Fine-Grained Benchmark for Video Captioning and Retrieval	Dec 31, 2024	RetrievalText Retrieval	—Unverified	0
Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval	Dec 26, 2024	Image-text RetrievalInformation Retrieval	CodeCode Available	0
CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives	Nov 29, 2024	reinforcement-learningReinforcement Learning	CodeCode Available	0

Show:10 25 50

← PrevPage 6 of 12Next →

No leaderboard results yet.