SOTAVerified

Multi-modal Classification

Papers

Showing 1-31 of 31 papers

Title | Status | Hype
Lightweight Joint Audio-Visual Deepfake Detection via Single-Stream Multi-Modal Learning Framework | - | 0
A Survey on Training-free Open-Vocabulary Semantic Segmentation | - | 0
A Comparative Study of Human Activity Recognition: Motion, Tactile, and multi-modal Approaches | - | 0
Multi-modal classification of forest biodiversity potential from 2D orthophotos and 3D airborne laser scanning point clouds | - | 0
Multimodal Learning with Uncertainty Quantification based on Discounted Belief Fusion | Code | 1
Hateful Meme Detection through Context-Sensitive Prompting and Fine-Grained Labeling | Code | 0
Turbo your multi-modal classification with contrastive learning | - | 0
FungiTastic: A multi-modal dataset and benchmark for image categorization | - | 0
Language Augmentation in CLIP for Improved Anatomy Detection on Multi-modal Medical Images | - | 0
Joint-Individual Fusion Structure with Fusion Attention Module for Multi-Modal Skin Cancer Classification | - | 0
PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization | Code | 1
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks | Code | 1
Contrastive Audio-Visual Masked Autoencoder | Code | 2
AVT: Audio-Video Transformer for Multimodal Action Recognition | - | 0
Multiscale Multimodal Transformer for Multimodal Action Recognition | - | 0
UAVM: Towards Unifying Audio and Visual Models | Code | 1
Multi-Modal Hypergraph Diffusion Network with Dual Prior for Alzheimer Classification | - | 0
On Modality Bias Recognition and Reduction | Code | 0
Multimodal Dynamics: Dynamical Fusion for Trustworthy Multimodal Classification | Code | 1
Multi Task Learning based Framework for Multimodal Classification | - | 0
Multi-modal Sarcasm Detection and Humor Classification in Code-mixed Conversations | Code | 1
Cross-Modal Retrieval Augmentation for Multi-Modal Classification | - | 0
Learning Multi-Modal Nonlinear Embeddings: Performance Bounds and an Algorithm | - | 0
Look, Read and Enrich. Learning from Scientific Figures and their Captions | Code | 0
Assessing Post Deletion in Sina Weibo: Multi-modal Classification of Hot Topics | - | 0
What Makes Training Multi-Modal Classification Networks Hard? | Code | 0
Image and Encoded Text Fusion for Multi-Modal Classification | Code | 0
CuisineNet: Food Attributes Classification using Multi-scale Convolution Network | - | 0
Efficient Large-Scale Multi-Modal Classification | - | 0
Deep Multi-Modal Classification of Intraductal Papillary Mucinous Neoplasms (IPMN) with Canonical Correlation Analysis | - | 0
Multi-modal Fusion for Diabetes Mellitus and Impaired Glucose Regulation Detection | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MMT | Top-1 Accuracy | 66.2 | - | Unverified
2 | CAV-MAE (Audio-Visual) | Top-1 Accuracy | 65.9 | - | Unverified
3 | UAVM | Top-1 Accuracy | 65.8 | - | Unverified
4 | AVT | Top-1 Accuracy | 63.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CAV-MAE | Average mAP | 0.51 | - | Unverified
2 | UAVM | Average mAP | 0.5 | - | Unverified