| Continuous Sign Language Recognition Through Cross-Modal Alignment of Video and Text Embeddings in a Joint-Latent Space | May 11, 2020 | cross-modal alignment, Decoder | —Unverified | 0 |
| CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Structure for Vision-Language Retrieval | Apr 15, 2023 | cross-modal alignment, Cross-Modal Retrieval | —Unverified | 0 |
| Cross-aware Early Fusion with Stage-divided Vision and Language Transformer Encoders for Referring Image Segmentation | Aug 14, 2024 | cross-modal alignment, Image Segmentation | —Unverified | 0 |
| Cross-Modal Alignment Learning of Vision-Language Conceptual Systems | Jul 31, 2022 | cross-modal alignment, Representation Learning | —Unverified | 0 |
| Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation | Sep 17, 2020 | cross-modal alignment, Image to text | —Unverified | 0 |
| Cross-modal Alignment with Optimal Transport for CTC-based ASR | Sep 24, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Cross-Modal Attention Alignment Network with Auxiliary Text Description for zero-shot sketch-based image retrieval | Jul 1, 2024 | cross-modal alignment, Image Retrieval | —Unverified | 0 |
| Cross-modal Context Fusion and Adaptive Graph Convolutional Network for Multimodal Conversational Emotion Recognition | Jan 25, 2025 | cross-modal alignment, Emotion Classification | —Unverified | 0 |
| Cross-Modal Cross-Domain Moment Alignment Network for Person Search | Jun 1, 2020 | cross-modal alignment, Person Search | —Unverified | 0 |
| Cross-Modal Denoising: A Novel Training Paradigm for Enhancing Speech-Image Retrieval | Aug 15, 2024 | cross-modal alignment, Denoising | —Unverified | 0 |
| Cross-Modal Prototype based Multimodal Federated Learning under Severely Missing Modality | Jan 25, 2024 | cross-modal alignment, Federated Learning | —Unverified | 0 |
| Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval | Oct 17, 2022 | cross-modal alignment, Object | —Unverified | 0 |
| CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis | Nov 1, 2024 | cross-modal alignment, Phenotype classification | —Unverified | 0 |
| Curriculum Audiovisual Learning | Jan 26, 2020 | Clustering, cross-modal alignment | —Unverified | 0 |
| DALR: Dual-level Alignment Learning for Multimodal Sentence Representation Learning | Jun 26, 2025 | cross-modal alignment, Representation Learning | —Unverified | 0 |
| DAP: Domain-aware Prompt Learning for Vision-and-Language Navigation | Nov 29, 2023 | cross-modal alignment, Navigate | —Unverified | 0 |
| Towards Brain Passage Retrieval -- An Investigation of EEG Query Representations | Dec 9, 2024 | cross-modal alignment, EEG | —Unverified | 0 |
| Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model | May 25, 2025 | cross-modal alignment, Image Segmentation | —Unverified | 0 |
| Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing | May 14, 2025 | cross-modal alignment, Denoising | —Unverified | 0 |
| DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding | May 8, 2025 | 3D visual grounding, cross-modal alignment | —Unverified | 0 |
| Detection-based Intermediate Supervision for Visual Question Answering | Dec 26, 2023 | cross-modal alignment, Logical Reasoning | —Unverified | 0 |
| DF-Calib: Targetless LiDAR-Camera Calibration via Depth Flow | Apr 2, 2025 | Autonomous Driving, Camera Calibration | —Unverified | 0 |
| DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment | Aug 22, 2023 | Attribute, Constituency Parsing | —Unverified | 0 |
| DiSa: Directional Saliency-Aware Prompt Learning for Generalizable Vision-Language Models | May 26, 2025 | cross-modal alignment, Domain Generalization | —Unverified | 0 |
| Disentangled Noisy Correspondence Learning | Aug 10, 2024 | cross-modal alignment, Cross-Modal Retrieval | —Unverified | 0 |
| Does Vision Accelerate Hierarchical Generalization in Neural Language Learners? | Feb 1, 2023 | cross-modal alignment, Language Acquisition | —Unverified | 0 |
| Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | Jun 5, 2025 | cross-modal alignment, Dense Captioning | —Unverified | 0 |
| Technical Approach for the EMI Challenge in the 8th Affective Behavior Analysis in-the-Wild Competition | Mar 13, 2025 | Contrastive Learning, cross-modal alignment | —Unverified | 0 |
| DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications | Feb 24, 2025 | cross-modal alignment, Earth Observation | —Unverified | 0 |
| Dynamic Cross-Modal Alignment for Robust Semantic Location Prediction | Dec 13, 2024 | cross-modal alignment, Prediction | —Unverified | 0 |
| EA-VTR: Event-Aware Video-Text Retrieval | Jul 10, 2024 | Action Recognition, Contrastive Learning | —Unverified | 0 |
| EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment | Oct 8, 2024 | cross-modal alignment, Hallucination | —Unverified | 0 |
| EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast | May 29, 2025 | Contrastive Learning, cross-modal alignment | —Unverified | 0 |
| End-to-end Semantic Object Detection with Cross-Modal Alignment | Feb 10, 2023 | Contrastive Learning, cross-modal alignment | —Unverified | 0 |
| Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework | Jul 12, 2024 | Contrastive Learning, cross-modal alignment | —Unverified | 0 |
| Enhancing LLMs for Time Series Forecasting via Structure-Guided Cross-Modal Alignment | May 19, 2025 | cross-modal alignment, Time Series | —Unverified | 0 |
| Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning | Dec 12, 2024 | Active Learning, cross-modal alignment | —Unverified | 0 |
| Enhancing Multimodal Emotion Recognition through Multi-Granularity Cross-Modal Alignment | Dec 30, 2024 | cross-modal alignment, Emotion Recognition | —Unverified | 0 |
| Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data | Mar 3, 2025 | cross-modal alignment, Style Transfer | —Unverified | 0 |
| Evaluating Attribute Confusion in Fashion Text-to-Image Generation | Jul 9, 2025 | Attribute, cross-modal alignment | —Unverified | 0 |
| Exploring Information-Theoretic Metrics Associated with Neural Collapse in Supervised Training | Sep 25, 2024 | Classification, cross-modal alignment | —Unverified | 0 |
| Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding | Oct 21, 2022 | cross-modal alignment, Sentence | —Unverified | 0 |
| FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer Text Inputs | Apr 2, 2025 | cross-modal alignment, Cross-Modal Retrieval | —Unverified | 0 |
| From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data | May 26, 2025 | cross-modal alignment, Instruction Following | —Unverified | 0 |
| Fully Aligned Network for Referring Image Segmentation | Sep 29, 2024 | cross-modal alignment, Decoder | —Unverified | 0 |
| Fusing Cross-modal and Uni-modal Representations: A Kronecker Product Approach | Jun 10, 2025 | cross-modal alignment | —Unverified | 0 |
| GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding | Sep 6, 2024 | cross-modal alignment, Language Modelling | —Unverified | 0 |
| GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations | Mar 26, 2025 | cross-modal alignment, Emotion Classification | —Unverified | 0 |
| Generalized Zero-Shot Classification via Semantics-Free Inter-Class Feature Generation | Jan 1, 2025 | Classification, cross-modal alignment | —Unverified | 0 |
| Generating Vision-Language Navigation Instructions Incorporated Fine-Grained Alignment Annotations | Jun 10, 2025 | cross-modal alignment, Navigate | —Unverified | 0 |