| Contrastive Audio-Visual Masked Autoencoder | Oct 2, 2022 | Audio Classification, Audio Tagging | Code Available | 2 |
| Multimodal Learning with Uncertainty Quantification based on Discounted Belief Fusion | Dec 23, 2024 | Decision Making, Multi-modal Classification | Code Available | 1 |
| PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization | Jul 27, 2023 | Domain Generalization, Image Classification | Code Available | 1 |
| FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks | Mar 4, 2023 | Cross-Modal Retrieval, Image Captioning | Code Available | 1 |
| UAVM: Towards Unifying Audio and Visual Models | Jul 29, 2022 | Audio Classification, audio-visual learning | Code Available | 1 |
| Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification | Jan 1, 2022 | Classification, Informativeness | Code Available | 1 |
| Multi-modal Sarcasm Detection and Humor Classification in Code-mixed Conversations | May 20, 2021 | Classification, Multi-modal Classification | Code Available | 1 |
| Lightweight Joint Audio-Visual Deepfake Detection via Single-Stream Multi-Modal Learning Framework | Jun 9, 2025 | audio-visual learning, DeepFake Detection | —Unverified | 0 |
| A Survey on Training-free Open-Vocabulary Semantic Segmentation | May 28, 2025 | Multi-modal Classification, Open Vocabulary Semantic Segmentation | —Unverified | 0 |
| A Comparative Study of Human Activity Recognition: Motion, Tactile, and multi-modal Approaches | May 13, 2025 | Activity Recognition, Classification | —Unverified | 0 |