SOTAVerified

cross-modal alignment

Papers

Showing 51-60 of 342 papers

Title | Status | Hype
BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning | Code | 1
GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency | Code | 1
DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors | Code | 1
Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning | Code | 1
BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction | Code | 1
CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally | Code | 1
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement | Code | 1
CLIP-Driven Fine-grained Text-Image Person Re-identification | Code | 1
Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners | Code | 1
BiPVL-Seg: Bidirectional Progressive Vision-Language Fusion with Global-Local Alignment for Medical Image Segmentation | Code | 1

No leaderboard results yet.