| Contrastive Audio-Visual Masked Autoencoder | Oct 2, 2022 | Audio Classification, Audio Tagging | Code Available | 2 |
| Multimodal Learning with Uncertainty Quantification based on Discounted Belief Fusion | Dec 23, 2024 | Decision Making, Multi-modal Classification | Code Available | 1 |
| PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization | Jul 27, 2023 | Domain Generalization, Image Classification | Code Available | 1 |
| FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks | Mar 4, 2023 | Cross-Modal Retrieval, Image Captioning | Code Available | 1 |
| UAVM: Towards Unifying Audio and Visual Models | Jul 29, 2022 | Audio Classification, audio-visual learning | Code Available | 1 |
| Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification | Jan 1, 2022 | Classification, Informativeness | Code Available | 1 |
| Multi-modal Sarcasm Detection and Humor Classification in Code-mixed Conversations | May 20, 2021 | Classification, Multi-modal Classification | Code Available | 1 |
| Lightweight Joint Audio-Visual Deepfake Detection via Single-Stream Multi-Modal Learning Framework | Jun 9, 2025 | audio-visual learning, DeepFake Detection | —Unverified | 0 |
| A Survey on Training-free Open-Vocabulary Semantic Segmentation | May 28, 2025 | Multi-modal Classification, Open Vocabulary Semantic Segmentation | —Unverified | 0 |
| A Comparative Study of Human Activity Recognition: Motion, Tactile, and multi-modal Approaches | May 13, 2025 | Activity Recognition, Classification | —Unverified | 0 |
| Multi-modal classification of forest biodiversity potential from 2D orthophotos and 3D airborne laser scanning point clouds | Jan 3, 2025 | Multi-modal Classification | —Unverified | 0 |
| Hateful Meme Detection through Context-Sensitive Prompting and Fine-Grained Labeling | Nov 13, 2024 | Model Optimization, Multi-modal Classification | Code Available | 0 |
| Turbo your multi-modal classification with contrastive learning | Sep 14, 2024 | Classification, Contrastive Learning | —Unverified | 0 |
| FungiTastic: A multi-modal dataset and benchmark for image categorization | Aug 24, 2024 | Classification, Few-Shot Learning | —Unverified | 0 |
| Language Augmentation in CLIP for Improved Anatomy Detection on Multi-modal Medical Images | May 31, 2024 | Anatomy, Image Description | —Unverified | 0 |
| Joint-Individual Fusion Structure with Fusion Attention Module for Multi-Modal Skin Cancer Classification | Dec 7, 2023 | Cancer Classification, Classification | —Unverified | 0 |
| AVT: Audio-Video Transformer for Multimodal Action Recognition | Sep 22, 2022 | Action Recognition, Audio Classification | —Unverified | 0 |
| Multiscale Multimodal Transformer for Multimodal Action Recognition | Sep 22, 2022 | Action Recognition, Audio Classification | —Unverified | 0 |
| Multi-Modal Hypergraph Diffusion Network with Dual Prior for Alzheimer Classification | Apr 4, 2022 | Multi-modal Classification | —Unverified | 0 |
| On Modality Bias Recognition and Reduction | Feb 25, 2022 | Action Recognition, Multi-modal Classification | Code Available | 0 |
| Multi Task Learning based Framework for Multimodal Classification | Jun 1, 2021 | Classification, Multi-modal Classification | —Unverified | 0 |
| Cross-Modal Retrieval Augmentation for Multi-Modal Classification | Apr 16, 2021 | Classification, Cross-Modal Retrieval | —Unverified | 0 |
| Learning Multi-Modal Nonlinear Embeddings: Performance Bounds and an Algorithm | Jun 3, 2020 | cross-modal alignment, General Classification | —Unverified | 0 |
| Look, Read and Enrich. Learning from Scientific Figures and their Captions | Sep 19, 2019 | Multi-modal Classification, Question Answering | Code Available | 0 |
| Assessing Post Deletion in Sina Weibo: Multi-modal Classification of Hot Topics | Jun 26, 2019 | General Classification, Multi-modal Classification | —Unverified | 0 |
| What Makes Training Multi-Modal Classification Networks Hard? | May 29, 2019 | Action Classification, Action Recognition | Code Available | 0 |
| Image and Encoded Text Fusion for Multi-Modal Classification | Oct 3, 2018 | Classification, General Classification | Code Available | 0 |
| CuisineNet: Food Attributes Classification using Multi-scale Convolution Network | May 30, 2018 | Classification, Cultural Vocal Bursts Intensity Prediction | —Unverified | 0 |
| Efficient Large-Scale Multi-Modal Classification | Feb 6, 2018 | Classification, Computational Efficiency | —Unverified | 0 |
| Deep Multi-Modal Classification of Intraductal Papillary Mucinous Neoplasms (IPMN) with Canonical Correlation Analysis | Oct 26, 2017 | General Classification, Multi-modal Classification | —Unverified | 0 |
| Multi-modal Fusion for Diabetes Mellitus and Impaired Glucose Regulation Detection | Apr 12, 2016 | Multi-modal Classification | —Unverified | 0 |