SOTAVerified

Image-text Retrieval

Papers

Showing 51-75 of 248 papers

Title | Status | Hype
Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark | Code | 1
Revisiting the Role of Language Priors in Vision-Language Models | Code | 1
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers | Code | 1
S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions | Code | 1
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner | Code | 1
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers | Code | 1
From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping | Code | 1
Learnable Pillar-based Re-ranking for Image-Text Retrieval | Code | 1
Rethinking Benchmarks for Cross-modal Image-text Retrieval | Code | 1
Image-text Retrieval via Preserving Main Semantics of Vision | Code | 1
Hyperbolic Image-Text Representations | Code | 1
Equivariant Similarity for Vision-Language Foundation Models | Code | 1
Multimodal Federated Learning via Contrastive Representation Ensemble | Code | 1
UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling | Code | 1
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval | Code | 1
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers | Code | 1
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval | Code | 1
Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift | Code | 1
FlexiViT: One Model for All Patch Sizes | Code | 1
ComCLIP: Training-Free Compositional Image and Text Matching | Code | 1
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning | Code | 1
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model | Code | 1
Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text | Code | 1
FETA: Towards Specializing Foundation Models for Expert Task Applications | Code | 1
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment | Code | 1
Page 3 of 10

No leaderboard results yet.