| Masked Vision and Language Modeling for Multi-modal Representation Learning | Aug 3, 2022 | cross-modal alignmentLanguage Modeling | —Unverified | 0 |
| Cross-Modal Alignment Learning of Vision-Language Conceptual Systems | Jul 31, 2022 | cross-modal alignmentRepresentation Learning | —Unverified | 0 |
| A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues | Jul 24, 2022 | cross-modal alignmentTrajectory Planning | CodeCode Available | 0 |
| VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix | Jun 17, 2022 | Contrastive Learningcross-modal alignment | —Unverified | 0 |
| Reinforced Cross-modal Alignment for Radiology Report Generation | May 1, 2022 | cross-modal alignmentDecision Making | CodeCode Available | 0 |
| LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Apr 18, 2022 | cross-modal alignmentDocument AI | CodeCode Available | 0 |
| mSLAM: Massively multilingual joint pre-training for speech and text | Feb 3, 2022 | cross-modal alignmentintent-classification | —Unverified | 0 |
| ERNIE-Layout: Layout-Knowledge Enhanced Multi-modal Pre-training for Document Understanding | Jan 16, 2022 | cross-modal alignmentDocument Classification | CodeCode Available | 0 |
| KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation | Jan 16, 2022 | cross-modal alignmentKnowledge Distillation | —Unverified | 0 |
| Learning Better Visual Representations for Weakly-Supervised Object Detection Using Natural Language Supervision | Sep 29, 2021 | cross-modal alignmentobject-detection | —Unverified | 0 |