SOTAVerified

cross-modal alignment

Papers

Showing 21-30 of 342 papers

| Title | Status | Hype |
| --- | --- | --- |
| Towards LLM-Centric Multimodal Fusion: A Survey on Integration Strategies and Techniques | | 0 |
| UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation | | 0 |
| EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast | | 0 |
| DiSa: Directional Saliency-Aware Prompt Learning for Generalizable Vision-Language Models | | 0 |
| From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data | | 0 |
| ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs | | 0 |
| ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers | | 0 |
| Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval | Code | 1 |
| Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model | | 0 |
| Co-AttenDWG: Co-Attentive Dimension-Wise Gating and Expert Fusion for Multi-Modal Offensive Content Detection | | 0 |

No leaderboard results yet.