| MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval | Jun 25, 2024 | cross-modal alignment, Moment Retrieval | Unverified | 0 |
| It is Never Too Late to Mend: Separate Learning for Multimedia Recommendation | Jun 12, 2024 | cross-modal alignment, Multimedia recommendation | Code Available | 0 |
| Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching | Jun 5, 2024 | cross-modal alignment, Image-text matching | Unverified | 0 |
| Multimodal Reasoning with Multimodal Knowledge Graph | Jun 4, 2024 | cross-modal alignment, Graph Attention | Unverified | 0 |
| OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All | May 25, 2024 | All, cross-modal alignment | Unverified | 0 |
| AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability | May 23, 2024 | cross-modal alignment, Language Modelling | Unverified | 0 |
| Context-Enhanced Video Moment Retrieval with Large Language Models | May 21, 2024 | cross-modal alignment, Language Modeling | Unverified | 0 |
| Listen Then See: Video Alignment with Speaker Attention | Apr 21, 2024 | cross-modal alignment, Question Answering | Code Available | 0 |
| Distributionally Robust Alignment for Medical Federated Vision-Language Pre-training Under Data Heterogeneity | Apr 5, 2024 | cross-modal alignment, Federated Learning | Unverified | 0 |
| CIRP: Cross-Item Relational Pre-training for Multimodal Product Bundling | Apr 2, 2024 | cross-modal alignment, Graph Learning | Unverified | 0 |