SOTAVerified

Zero-shot Text-to-Image Retrieval

Papers

Showing 11-15 of 15 papers

Title | Status | Hype
ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training | Code | 0
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset | | 0
CLIP-PING: Boosting Lightweight Vision-Language Models with Proximus Intrinsic Neighbors Guidance | | 0
CAVL: Learning Contrastive and Adaptive Representations of Vision and Language | | 0
An analysis of vision-language models for fabric retrieval | | 0

No leaderboard results yet.