| PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization | Jul 27, 2023 | Domain Generalization, Image Classification | Code Available | 1 |
| FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks | Mar 4, 2023 | Cross-Modal Retrieval, Image Captioning | Code Available | 1 |
| Contrastive Audio-Visual Masked Autoencoder | Oct 2, 2022 | Audio Classification, Audio Tagging | Code Available | 2 |
| AVT: Audio-Video Transformer for Multimodal Action Recognition | Sep 22, 2022 | Action Recognition, Audio Classification | —Unverified | 0 |
| Multiscale Multimodal Transformer for Multimodal Action Recognition | Sep 22, 2022 | Action Recognition, Audio Classification | —Unverified | 0 |
| UAVM: Towards Unifying Audio and Visual Models | Jul 29, 2022 | Audio Classification, audio-visual learning | Code Available | 1 |
| Multi-Modal Hypergraph Diffusion Network with Dual Prior for Alzheimer Classification | Apr 4, 2022 | Multi-modal Classification | —Unverified | 0 |
| On Modality Bias Recognition and Reduction | Feb 25, 2022 | Action Recognition, Multi-modal Classification | Code Available | 0 |
| Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification | Jan 1, 2022 | Classification, Informativeness | Code Available | 1 |
| Multi Task Learning based Framework for Multimodal Classification | Jun 1, 2021 | Classification, Multi-modal Classification | —Unverified | 0 |