| OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection | Mar 9, 2025 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems | Mar 6, 2025 | cross-modal alignment | Code Available | 0 |
| Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data | Mar 3, 2025 | cross-modal alignment, Style Transfer | Unverified | 0 |
| Language Model Mapping in Multimodal Music Learning: A Grand Challenge Proposal | Mar 1, 2025 | cross-modal alignment, Language Modeling | Unverified | 0 |
| UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting | Feb 25, 2025 | 3DGS, cross-modal alignment | Unverified | 0 |
| DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications | Feb 24, 2025 | cross-modal alignment, Earth Observation | Unverified | 0 |
| MV-CLAM: Multi-View Molecular Interpretation with Cross-Modal Projection via Language Model | Feb 23, 2025 | cross-modal alignment, Language Modeling | Code Available | 0 |
| CardiacMamba: A Multimodal RGB-RF Fusion Framework with State Space Models for Remote Physiological Measurement | Feb 19, 2025 | cross-modal alignment, Fairness | Code Available | 0 |
| NOTA: Multimodal Music Notation Understanding for Visual Large Language Model | Feb 17, 2025 | cross-modal alignment, Language Modeling | Unverified | 0 |
| A Survey of Automatic Prompt Engineering: An Optimization Perspective | Feb 17, 2025 | cross-modal alignment, Prompt Engineering | Unverified | 0 |
| MDE: Modality Discrimination Enhancement for Multi-modal Recommendation | Feb 8, 2025 | cross-modal alignment, Multi-modal Recommendation | Unverified | 0 |
| Leveraging Pre-Trained Models for Multimodal Class-Incremental Learning under Adaptive Fusion | Feb 7, 2025 | class-incremental learning, Class Incremental Learning | Unverified | 0 |
| Cross-modal Context Fusion and Adaptive Graph Convolutional Network for Multimodal Conversational Emotion Recognition | Jan 25, 2025 | cross-modal alignment, Emotion Classification | Unverified | 0 |
| Integrate Temporal Graph Learning into LLM-based Temporal Knowledge Graph Model | Jan 21, 2025 | cross-modal alignment, Graph Embedding | Unverified | 0 |
| CGP-Tuning: Structure-Aware Soft Prompt Tuning for Code Vulnerability Detection | Jan 8, 2025 | Computational Efficiency, cross-modal alignment | Unverified | 0 |
| Audio-Visual Semantic Graph Network for Audio-Visual Event Localization | Jan 1, 2025 | audio-visual event localization, cross-modal alignment | Unverified | 0 |
| Generalized Zero-Shot Classification via Semantics-Free Inter-Class Feature Generation | Jan 1, 2025 | Classification, cross-modal alignment | Unverified | 0 |
| Chat-based Person Retrieval via Dialogue-Refined Cross-Modal Alignment | Jan 1, 2025 | Attribute, cross-modal alignment | Unverified | 0 |
| Enhancing Multimodal Emotion Recognition through Multi-Granularity Cross-Modal Alignment | Dec 30, 2024 | cross-modal alignment, Emotion Recognition | Unverified | 0 |
| ChartAdapter: Large Vision-Language Model for Chart Summarization | Dec 30, 2024 | Chart Understanding, cross-modal alignment | Unverified | 0 |
| Enhancing Visual Representation for Text-based Person Searching | Dec 30, 2024 | cross-modal alignment, Person Search | Code Available | 0 |
| Bag of Tricks for Multimodal AutoML with Image, Text, and Tabular Data | Dec 19, 2024 | AutoML, cross-modal alignment | Unverified | 0 |
| RAC3: Retrieval-Augmented Corner Case Comprehension for Autonomous Driving with Vision-Language Models | Dec 15, 2024 | Autonomous Driving, Contrastive Learning | Unverified | 0 |
| Wearable Accelerometer Foundation Models for Health via Knowledge Distillation | Dec 15, 2024 | Activity Recognition, cross-modal alignment | Unverified | 0 |
| Dynamic Cross-Modal Alignment for Robust Semantic Location Prediction | Dec 13, 2024 | cross-modal alignment, Prediction | Unverified | 0 |