SOTAVerified

Zero-shot Image Retrieval

Papers

Showing 11–20 of 29 papers

| Title | Status | Hype |
| --- | --- | --- |
| FLAVA: A Foundational Language And Vision Alignment Model | Code | 1 |
| Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models | Code | 0 |
| CLIP-PING: Boosting Lightweight Vision-Language Models with Proximus Intrinsic Neighbors Guidance | | 0 |
| Piecewise-Linear Manifolds for Deep Metric Learning | | 0 |
| M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining | Code | 0 |
| GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training | | 0 |
| ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training | Code | 0 |
| Curriculum Learning for Data-Efficient Vision-Language Alignment | | 0 |
| Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark | Code | 0 |
| Visual Representation Learning with Self-Supervised Attention for Low-Label High-data Regime | Code | 0 |

No leaderboard results yet.