SOTAVerified

cross-modal alignment

Papers

Showing 31–40 of 342 papers

Title | Status | Hype
U-SAM: An audio language Model for Unified Speech, Audio, and Music Understanding | Code | 1
MSCI: Addressing CLIP's Inherent Limitations for Compositional Zero-Shot Learning | Code | 1
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment | Code | 1
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Code | 1
BiPVL-Seg: Bidirectional Progressive Vision-Language Fusion with Global-Local Alignment for Medical Image Segmentation | Code | 1
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation | Code | 1
CoMP: Continual Multimodal Pre-training for Vision Foundation Models | Code | 1
Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations | Code | 1
Cross-modal Causal Relation Alignment for Video Question Grounding | Code | 1
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding | Code | 1

No leaderboard results yet.