SOTAVerified

Image-text Retrieval

Papers

Showing 21-30 of 248 papers

| Title | Status | Hype |
| --- | --- | --- |
| MedCLIP: Contrastive Learning from Unpaired Medical Images and Text | Code | 2 |
| Cross-lingual and Multilingual CLIP | Code | 2 |
| Vision-Language Pre-Training with Triple Contrastive Learning | Code | 2 |
| WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning | Code | 2 |
| Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Code | 2 |
| Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Code | 1 |
| ReCon: Enhancing True Correspondence Discrimination through Relation Consistency for Robust Noisy Correspondence Learning | Code | 1 |
| I0T: Embedding Standardization Method Towards Zero Modality Gap | Code | 1 |
| A Survey of Medical Vision-and-Language Applications and Their Techniques | Code | 1 |
| Nearest Neighbor Normalization Improves Multimodal Retrieval | Code | 1 |

No leaderboard results yet.