SOTAVerified

Multi-modal Classification

Papers

Showing 1-31 of 31 papers

Title | Status | Hype
Contrastive Audio-Visual Masked Autoencoder | Code | 2
Multimodal Learning with Uncertainty Quantification based on Discounted Belief Fusion | Code | 1
PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization | Code | 1
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks | Code | 1
UAVM: Towards Unifying Audio and Visual Models | Code | 1
Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification | Code | 1
Multi-modal Sarcasm Detection and Humor Classification in Code-mixed Conversations | Code | 1
Lightweight Joint Audio-Visual Deepfake Detection via Single-Stream Multi-Modal Learning Framework |  | 0
A Survey on Training-free Open-Vocabulary Semantic Segmentation |  | 0
A Comparative Study of Human Activity Recognition: Motion, Tactile, and multi-modal Approaches |  | 0
Multi-modal classification of forest biodiversity potential from 2D orthophotos and 3D airborne laser scanning point clouds |  | 0
Hateful Meme Detection through Context-Sensitive Prompting and Fine-Grained Labeling | Code | 0
Turbo your multi-modal classification with contrastive learning |  | 0
FungiTastic: A multi-modal dataset and benchmark for image categorization |  | 0
Language Augmentation in CLIP for Improved Anatomy Detection on Multi-modal Medical Images |  | 0
Joint-Individual Fusion Structure with Fusion Attention Module for Multi-Modal Skin Cancer Classification |  | 0
AVT: Audio-Video Transformer for Multimodal Action Recognition |  | 0
Multiscale Multimodal Transformer for Multimodal Action Recognition |  | 0
Multi-Modal Hypergraph Diffusion Network with Dual Prior for Alzheimer Classification |  | 0
On Modality Bias Recognition and Reduction | Code | 0
Multi Task Learning based Framework for Multimodal Classification |  | 0
Cross-Modal Retrieval Augmentation for Multi-Modal Classification |  | 0
Learning Multi-Modal Nonlinear Embeddings: Performance Bounds and an Algorithm |  | 0
Look, Read and Enrich. Learning from Scientific Figures and their Captions | Code | 0
Assessing Post Deletion in Sina Weibo: Multi-modal Classification of Hot Topics |  | 0
What Makes Training Multi-Modal Classification Networks Hard? | Code | 0
Image and Encoded Text Fusion for Multi-Modal Classification | Code | 0
CuisineNet: Food Attributes Classification using Multi-scale Convolution Network |  | 0
Efficient Large-Scale Multi-Modal Classification |  | 0
Deep Multi-Modal Classification of Intraductal Papillary Mucinous Neoplasms (IPMN) with Canonical Correlation Analysis |  | 0
Multi-modal Fusion for Diabetes Mellitus and Impaired Glucose Regulation Detection |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MMT | Top-1 Accuracy | 66.2 |  | Unverified
2 | CAV-MAE (Audio-Visual) | Top-1 Accuracy | 65.9 |  | Unverified
3 | UAVM | Top-1 Accuracy | 65.8 |  | Unverified
4 | AVT | Top-1 Accuracy | 63.9 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CAV-MAE | Average mAP | 0.51 |  | Unverified
2 | UAVM | Average mAP | 0.5 |  | Unverified