SOTAVerified

cross-modal alignment

Papers

Showing 311-320 of 342 papers

Title | Status | Hype
Masked Vision and Language Modeling for Multi-modal Representation Learning | - | 0
Cross-Modal Alignment Learning of Vision-Language Conceptual Systems | - | 0
A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues | Code | 0
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix | - | 0
Reinforced Cross-modal Alignment for Radiology Report Generation | Code | 0
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Code | 0
mSLAM: Massively multilingual joint pre-training for speech and text | - | 0
ERNIE-Layout: Layout-Knowledge Enhanced Multi-modal Pre-training for Document Understanding | Code | 0
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation | - | 0
Learning Better Visual Representations for Weakly-Supervised Object Detection Using Natural Language Supervision | - | 0

No leaderboard results yet.