SOTAVerified

Image-text Retrieval

Papers

Showing 141–150 of 248 papers

Title | Status | Hype
EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models | | 0
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | | 0
Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning | | 0
Fine-tuning Multimodal Transformers on Edge: A Parallel Split Learning Approach | | 0
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations | | 0
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks | | 0
GAFNet: A Global Fourier Self Attention Based Novel Network for multi-modal downstream tasks | | 0
Generative Negative Text Replay for Continual Vision-Language Pretraining | | 0
Global–Local Information Soft-Alignment for Cross-Modal Remote-Sensing Image–Text Retrieval | | 0
HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval | | 0

No leaderboard results yet.