| Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment | Sep 22, 2024 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities | Sep 17, 2024 | cross-modal alignment, Question Answering | Unverified | 0 |
| CAST: Cross-modal Alignment Similarity Test for Vision Language Models | Sep 17, 2024 | cross-modal alignment, Question Answering | Code Available | 0 |
| KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph | Sep 17, 2024 | cross-modal alignment, Image Captioning | Code Available | 0 |
| NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training | Sep 15, 2024 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization | Sep 12, 2024 | cross-modal alignment | Unverified | 0 |
| GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding | Sep 6, 2024 | cross-modal alignment, Language Modelling | Unverified | 0 |
| Temporal Order Preserved Optimal Transport-based Cross-modal Knowledge Transfer Learning for ASR | Sep 3, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Focus on Focus: Focus-oriented Representation Learning and Multi-view Cross-modal Alignment for Glioma Grading | Aug 16, 2024 | Contrastive Learning, cross-modal alignment | Code Available | 0 |
| Cross-Modal Denoising: A Novel Training Paradigm for Enhancing Speech-Image Retrieval | Aug 15, 2024 | cross-modal alignment, Denoising | Unverified | 0 |