| Video Referring Expression Comprehension via Transformer with Content-aware Query | Oct 6, 2022 | cross-modal alignment, Referring Expression | Unverified | 0 |
| JPG - Jointly Learn to Align: Automated Disease Prediction and Radiology Report Generation | Oct 1, 2022 | cross-modal alignment, Disease Prediction | Unverified | 0 |
| Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection | Sep 28, 2022 | 2D Object Detection, cross-modal alignment | Unverified | 0 |
| TokenFlow: Rethinking Fine-grained Cross-modal Alignment in Vision-Language Retrieval | Sep 28, 2022 | cross-modal alignment, Retrieval | Unverified | 0 |
| Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval | Sep 23, 2022 | cross-modal alignment, Information Retrieval | Unverified | 0 |
| OmniVL: One Foundation Model for Image-Language and Video-Language Tasks | Sep 15, 2022 | Action Classification, Action Recognition | Unverified | 0 |
| Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment | Aug 29, 2022 | cross-modal alignment, Image-text Retrieval | Code Available | 1 |
| See What You See: Self-supervised Cross-modal Retrieval of Visual Stimuli from Brain Activity | Aug 7, 2022 | cross-modal alignment, Cross-Modal Retrieval | Unverified | 0 |
| Fine-Grained Semantically Aligned Vision-Language Pre-Training | Aug 4, 2022 | cross-modal alignment, object-detection | Code Available | 1 |
| Masked Vision and Language Modeling for Multi-modal Representation Learning | Aug 3, 2022 | cross-modal alignment, Language Modeling | Unverified | 0 |