| AlignVSR: Audio-Visual Cross-Modal Alignment for Visual Speech Recognition | Oct 21, 2024 | cross-modal alignment, speech-recognition | Code Available | 1 |
| LESS: Label-Efficient and Single-Stage Referring 3D Segmentation | Oct 17, 2024 | cross-modal alignment, Instance Segmentation | Code Available | 1 |
| Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners | Oct 3, 2024 | cross-modal alignment | Code Available | 1 |
| MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension | Sep 20, 2024 | cross-modal alignment, Referring Expression | Code Available | 1 |
| Beyond Uncertainty: Evidential Deep Learning for Robust Video Temporal Grounding | Aug 29, 2024 | cross-modal alignment, Deep Learning | Code Available | 1 |
| A Survey on Facial Expression Recognition of Static and Dynamic Emotions | Aug 28, 2024 | cross-modal alignment, Facial Expression Recognition | Code Available | 1 |
| Advancing Multi-grained Alignment for Contrastive Language-Audio Pre-training | Aug 15, 2024 | cross-modal alignment | Code Available | 1 |
| Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment | Jul 18, 2024 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 1 |
| Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning | Jul 16, 2024 | Caption Generation, cross-modal alignment | Code Available | 1 |
| Towards Bridging the Cross-modal Semantic Gap for Multi-modal Recommendation | Jul 7, 2024 | cross-modal alignment, Multi-modal Recommendation | Code Available | 1 |