SOTAVerified

Image-text Retrieval

Papers

Showing 176–200 of 248 papers

| Title | Status | Hype |
| --- | --- | --- |
| Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval | Code | 0 |
| Scene Graph Based Fusion Network For Image-Text Retrieval | | 0 |
| Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening | | 0 |
| Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning | | 0 |
| Semantic-Preserving Augmentation for Robust Image-Text Retrieval | Code | 0 |
| The style transformer with common knowledge optimization for image-text retrieval | | 0 |
| Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis | Code | 0 |
| USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval | Code | 0 |
| HADA: A Graph-based Amalgamation Framework in Image-text Retrieval | Code | 0 |
| NAPReg: Nouns As Proxies Regularization for Semantically Aware Cross-Modal Embeddings | Code | 0 |
| VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching | | 0 |
| Multilateral Semantic Relations Modeling for Image Text Retrieval | | 0 |
| GAFNet: A Global Fourier Self Attention Based Novel Network for multi-modal downstream tasks | | 0 |
| ViLEM: Visual-Language Error Modeling for Image-Text Retrieval | | 0 |
| Efficient Image Captioning for Edge Devices | | 0 |
| HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval | | 0 |
| NLIP: Noise-robust Language-Image Pre-training | | 0 |
| Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing | | 0 |
| Masked Contrastive Pre-Training for Efficient Video-Text Retrieval | | 0 |
| Generative Negative Text Replay for Continual Vision-Language Pretraining | | 0 |
| RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data | | 0 |
| Dissecting Deep Metric Learning Losses for Image-Text Retrieval | Code | 0 |
| Image-Text Retrieval with Binary and Continuous Label Supervision | | 0 |
| CPL: Counterfactual Prompt Learning for Vision and Language Models | | 0 |
| MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning | | 0 |

No leaderboard results yet.