| Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Aug 2, 2024 | cross-modal alignment, Multiple Object Tracking | Code Available | 2 |
| Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | Jul 26, 2024 | cross-modal alignment, image-classification | Unverified | 0 |
| DAC: 2D-3D Retrieval with Noisy Labels via Divide-and-Conquer Alignment and Correction | Jul 25, 2024 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 0 |
| Multimodal Machine Learning in Mental Health: A Survey of Data, Algorithms, and Challenges | Jul 23, 2024 | cross-modal alignment, Fairness | Unverified | 0 |
| Craft: Cross-modal Aligned Features Improve Robustness of Prompt Tuning | Jul 22, 2024 | cross-modal alignment | Code Available | 0 |
| Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment | Jul 18, 2024 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 1 |
| Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning | Jul 16, 2024 | Caption Generation, cross-modal alignment | Code Available | 1 |
| Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework | Jul 12, 2024 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| EA-VTR: Event-Aware Video-Text Retrieval | Jul 10, 2024 | Action Recognition, Contrastive Learning | Unverified | 0 |
| Towards Bridging the Cross-modal Semantic Gap for Multi-modal Recommendation | Jul 7, 2024 | cross-modal alignment, Multi-modal Recommendation | Code Available | 1 |