SOTAVerified

cross-modal alignment

Papers

Showing 41–50 of 342 papers

Title | Status | Hype
CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally | Code | 1
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning | Code | 1
Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning | Code | 1
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement | Code | 1
Free Lunch Enhancements for Multi-modal Crowd Counting | Code | 1
ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding | Code | 1
GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency | Code | 1
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation | Code | 1
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model | Code | 1
SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality | Code | 1

No leaderboard results yet.