SOTAVerified

Image-text Retrieval

Papers

Showing 11-20 of 248 papers

Title | Status | Hype
RWKV-CLIP: A Robust Vision-Language Representation Learner | Code | 2
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text | Code | 2
Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Code | 2
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents | Code | 2
VeCLIP: Improving CLIP Training via Visual-enriched Captions | Code | 2
DreamLIP: Language-Image Pre-training with Long Captions | Code | 2
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Code | 2
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Code | 2
Cross-lingual and Multilingual CLIP | Code | 2
FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation | Code | 2

No leaderboard results yet.