| Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding | Apr 4, 2022 | cross-modal alignment, Natural Language Queries | Code Available | 1 |
| SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality | Nov 27, 2024 | cross-modal alignment | Code Available | 1 |
| PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes | Jun 19, 2024 | cross-modal alignment | Code Available | 1 |
| Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning | Jan 1, 2025 | cross-modal alignment, Denoising | Code Available | 1 |
| Symbiotic Adversarial Learning for Attribute-based Person Search | Jul 19, 2020 | Attribute, cross-modal alignment | Code Available | 1 |
| The Devil is in the Details: Boosting Guided Depth Super-Resolution via Rethinking Cross-Modal Alignment and Aggregation | Jan 16, 2024 | cross-modal alignment, feature selection | Code Available | 1 |
| Enhancing LLMs for Time Series Forecasting via Structure-Guided Cross-Modal Alignment | May 19, 2025 | cross-modal alignment, Time Series | Unverified | 0 |
| Co-AttenDWG: Co-Attentive Dimension-Wise Gating and Expert Fusion for Multi-Modal Offensive Content Detection | May 25, 2025 | cross-modal alignment, Scene Understanding | Unverified | 0 |
| Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework | Jul 12, 2024 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast | May 29, 2025 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| Coarse-to-fine Alignment Makes Better Speech-image Retrieval | Aug 15, 2024 | cross-modal alignment, Image Retrieval | Unverified | 0 |
| A Survey of Automatic Prompt Engineering: An Optimization Perspective | Feb 17, 2025 | cross-modal alignment, Prompt Engineering | Unverified | 0 |
| EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment | Oct 8, 2024 | cross-modal alignment, Hallucination | Unverified | 0 |
| EA-VTR: Event-Aware Video-Text Retrieval | Jul 10, 2024 | Action Recognition, Contrastive Learning | Unverified | 0 |
| CLIP-PING: Boosting Lightweight Vision-Language Models with Proximus Intrinsic Neighbors Guidance | Dec 5, 2024 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| Dynamic Cross-Modal Alignment for Robust Semantic Location Prediction | Dec 13, 2024 | cross-modal alignment, Prediction | Unverified | 0 |
| DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications | Feb 24, 2025 | cross-modal alignment, Earth Observation | Unverified | 0 |
| Technical Approach for the EMI Challenge in the 8th Affective Behavior Analysis in-the-Wild Competition | Mar 13, 2025 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| Clip4Retrofit: Enabling Real-Time Image Labeling on Edge Devices via Cross-Architecture CLIP Distillation | May 23, 2025 | Autonomous Driving, cross-modal alignment | Unverified | 0 |
| End-to-end Semantic Object Detection with Cross-Modal Alignment | Feb 10, 2023 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs | May 26, 2025 | cross-modal alignment, Emotion Recognition | Unverified | 0 |
| 4D-ACFNet: A 4D Attention Mechanism-Based Prognostic Framework for Colorectal Cancer Liver Metastasis Integrating Multimodal Spatiotemporal Features | Mar 12, 2025 | cross-modal alignment, Disentanglement | Unverified | 0 |
| Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | Jun 5, 2025 | cross-modal alignment, Dense Captioning | Unverified | 0 |
| Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning | Dec 12, 2024 | Active Learning, cross-modal alignment | Unverified | 0 |
| Does Vision Accelerate Hierarchical Generalization in Neural Language Learners? | Feb 1, 2023 | cross-modal alignment, Language Acquisition | Unverified | 0 |