| Cross-Modal Alignment Learning of Vision-Language Conceptual Systems | Jul 31, 2022 | cross-modal alignment, Representation Learning | Unverified | 0 |
| A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues | Jul 24, 2022 | cross-modal alignment, Trajectory Planning | Code Available | 0 |
| BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning | Jun 17, 2022 | cross-modal alignment, Representation Learning | Code Available | 1 |
| VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix | Jun 17, 2022 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections | May 24, 2022 | Computational Efficiency, cross-modal alignment | Code Available | 1 |
| Reinforced Cross-modal Alignment for Radiology Report Generation | May 1, 2022 | cross-modal alignment, Decision Making | Code Available | 0 |
| LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Apr 18, 2022 | cross-modal alignment, Document AI | Code Available | 0 |
| DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors | Apr 6, 2022 | 3D geometry, 3D Object Detection | Code Available | 1 |
| Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding | Apr 4, 2022 | cross-modal alignment, Natural Language Queries | Code Available | 1 |
| Vision-Language Pre-Training with Triple Contrastive Learning | Feb 21, 2022 | Contrastive Learning, cross-modal alignment | Code Available | 2 |