SOTAVerified

Image-text Retrieval

Papers

Showing 51–60 of 248 papers

Title | Status | Hype
Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark | Code | 1
Revisiting the Role of Language Priors in Vision-Language Models | Code | 1
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers | Code | 1
S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions | Code | 1
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner | Code | 1
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers | Code | 1
From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping | Code | 1
Learnable Pillar-based Re-ranking for Image-Text Retrieval | Code | 1
Rethinking Benchmarks for Cross-modal Image-text Retrieval | Code | 1
Image-text Retrieval via Preserving Main Semantics of Vision | Code | 1

No leaderboard results yet.