SOTAVerified

Image-text Retrieval

Papers

Showing 11–20 of 248 papers

| Title | Status | Hype |
|---|---|---|
| AGATE: Stealthy Black-box Watermarking for Multimodal Model Copyright Protection | | 0 |
| Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs | | 0 |
| FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations | | 0 |
| Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Code | 1 |
| SeLIP: Similarity Enhanced Contrastive Language Image Pretraining for Multi-modal Head MRI | | 0 |
| Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis | Code | 2 |
| Anatomy-Aware Conditional Image-Text Retrieval | | 0 |
| Variance-Aware Loss Scheduling for Multimodal Alignment in Low-Data Settings | | 0 |
| LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning | | 0 |
| MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations | | 0 |

No leaderboard results yet.