SOTAVerified

Image-text Retrieval

Papers

Showing 91-100 of 248 papers

Title | Status | Hype
Learning the Best Pooling Strategy for Visual Semantic Embedding | Code | 1
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports | Code | 1
Graph Optimal Transport for Cross-Domain Alignment | Code | 1
Large-Scale Adversarial Training for Vision-and-Language Representation Learning | Code | 1
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers | Code | 1
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval | Code | 1
Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval | Code | 1
UNITER: UNiversal Image-TExt Representation Learning | Code | 1
Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval | | 0
Adding simple structure at inference improves Vision-Language Compositionality | Code | 0

No leaderboard results yet.