SOTAVerified

Image-text Retrieval

Papers

Showing 11-20 of 248 papers

Title | Status | Hype
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Code | 2
Towards Vision-Language Geo-Foundation Model: A Survey | Code | 2
RWKV-CLIP: A Robust Vision-Language Representation Learner | Code | 2
Accelerating Transformers with Spectrum-Preserving Token Merging | Code | 2
DreamLIP: Language-Image Pre-training with Long Captions | Code | 2
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Code | 2
Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Code | 2
VeCLIP: Improving CLIP Training via Visual-enriched Captions | Code | 2
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Code | 2
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents | Code | 2

No leaderboard results yet.