SOTAVerified

cross-modal alignment

Papers

Showing 101–110 of 342 papers

Title | Status | Hype
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment | Code | 1
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation | Code | 1
EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation | Code | 1
Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning | Code | 1
ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks | Code | 1
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval | Code | 1
Adaptive Spatial Transcriptomics Interpolation via Cross-modal Cross-slice Modeling | Code | 0
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Code | 0
A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues | Code | 0
Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation | Code | 0

No leaderboard results yet.