SOTAVerified

cross-modal alignment

Papers

Showing 181–190 of 342 papers

Title | Status | Hype
Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | - | 0
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting | - | 0
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces | - | 0
Video Referring Expression Comprehension via Transformer with Content-aware Query | - | 0
Video Referring Expression Comprehension via Transformer with Content-conditioned Query | - | 0
ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers | - | 0
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix | - | 0
VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering | - | 0
Wearable Accelerometer Foundation Models for Health via Knowledge Distillation | - | 0
WhisQ: Cross-Modal Representation Learning for Text-to-Music MOS Prediction | - | 0

No leaderboard results yet.