SOTAVerified

Multimodal Emotion Recognition

This is a leaderboard for multimodal emotion recognition on the IEMOCAP dataset. The modality abbreviations are A: Acoustic, T: Text, V: Visual.

Please include the modality in brackets after the model name.

All models must use the standard five emotion categories and are evaluated with the standard leave-one-session-out (LOSO) protocol. See the individual papers for references.
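For readers unfamiliar with the protocol: IEMOCAP is recorded in five sessions, and LOSO evaluation trains on four sessions while testing on the held-out fifth, rotating through all five. Below is a minimal, hedged sketch of such a splitter, assuming each utterance record carries a `session` id in 1–5 (the record layout here is illustrative, not the dataset's actual schema):

```python
def loso_splits(utterances, n_sessions=5):
    """Yield (held_out_session, train, test) leave-one-session-out folds.

    Assumes each utterance is a dict with a "session" key in 1..n_sessions;
    this schema is a placeholder, not IEMOCAP's actual file format.
    """
    for held_out in range(1, n_sessions + 1):
        train = [u for u in utterances if u["session"] != held_out]
        test = [u for u in utterances if u["session"] == held_out]
        yield held_out, train, test
```

Reported numbers under this protocol are typically the average (or pooled result) over the five folds, so no utterance is ever scored by a model that saw its session during training.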

Papers

Showing 101–150 of 180 papers

Title | Status | Hype
CSAT‑FTCN: A Fuzzy‑Oriented Model with Contextual Self‑attention Network for Multimodal Emotion Recognition | — | 0
Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring | — | 0
Emotion Recognition with Pre-Trained Transformers Using Multimodal Signals | — | 0
Multimodal Emotion Recognition among Couples from Lab Settings to Daily Life using Smartwatches | — | 0
FAF: A novel multimodal emotion recognition approach integrating face, body and text | — | 0
Speech Emotion Recognition Based on Self-Attention Weight Correction for Acoustic and Text Features | — | 0
Multimodal Information Bottleneck: Learning Minimal Sufficient Unimodal and Multimodal Representations | Code | 1
Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities | Code | 1
Multilevel Transformer For Multimodal Emotion Recognition | — | 0
FV2ES: A Fully End2End Multimodal System for Fast Yet Effective Video Emotion Recognition Inference | Code | 1
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data | Code | 0
VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition | Code | 0
GA2MIF: Graph and Attention Based Two-Stage Multi-Source Information Fusion for Conversational Emotion Detection | Code | 1
Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss | Code | 1
A Multibias-mitigated and Sentiment Knowledge Enriched Transformer for Debiasing in Multimodal Conversational Emotion Recognition | — | 0
Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition | Code | 0
GraphCFC: A Directed Graph Based Cross-Modal Feature Complementation Approach for Multimodal Conversational Emotion Recognition | Code | 1
0/1 Deep Neural Networks via Block Coordinate Descent | — | 0
COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition | — | 0
Do Multimodal Emotion Recognition Models Tackle Ambiguity? | — | 0
Bias and Fairness on Multimodal Emotion Detection Algorithms | — | 0
COGMEN: COntextualized GNN based Multimodal Emotion recognitioN | Code | 1
MMER: Multimodal Multi-task Learning for Speech Emotion Recognition | Code | 1
A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition | Code | 1
Continuous-Time Audiovisual Fusion with Recurrence vs. Attention for In-The-Wild Affect Recognition | — | 0
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses | Code | 1
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models | — | 0
Interpretability for Multimodal Emotion Recognition using Concept Activation Vectors | — | 0
Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition | Code | 1
A proposal for Multimodal Emotion Recognition using aural transformers and Action Units on RAVDESS dataset | Code | 1
Shapes of Emotions: Multimodal Emotion Recognition in Conversations via Emotion Shifts | Code | 1
LMR-CBT: Learning Modality-fused Representations with CB-Transformer for Multimodal Emotion Recognition from Unaligned Multimodal Sequences | — | 0
Multimodal Emotion Recognition on RAVDESS Dataset Using Transfer Learning | — | 0
Multimodal End-to-End Group Emotion Recognition using Cross-Modal Attention | — | 0
Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition | Code | 1
A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition | Code | 1
MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition | — | 0
Multimodal Emotion-Cause Pair Extraction in Conversations | — | 0
Multimodal Emotion Recognition with High-level Speech and Text Features | Code | 1
Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition | — | 0
Progressive Modality Reinforcement for Human Multimodal Emotion Recognition From Unaligned Multimodal Sequences | — | 0
Analyzing the Influence of Dataset Composition for Emotion Recognition | — | 0
Combining deep and unsupervised features for multilingual speech emotion recognition | Code | 0
MSAF: Multimodal Split Attention Fusion | Code | 1
Context-Dependent Domain Adversarial Neural Network for Multimodal Emotion Recognition | — | 0
Emotion recognition by fusing time synchronous and time asynchronous representations | — | 0
Multimodal Emotion Recognition with Transformer-Based Self Supervised Feature Fusion | Code | 1
An Audio-Video Deep and Transfer Learning Framework for Multimodal Emotion Recognition in the wild | — | 0
Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition | Code | 1
Jointly Fine-Tuning “BERT-like” Self Supervised Models to Improve Multimodal Speech Emotion Recognition | Code | 1
Page 3 of 4

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 86.52 | — | Unverified
2 | Joyful | Weighted F1 | 85.7 | — | Unverified
3 | COGMEN | Weighted F1 | 84.5 | — | Unverified
4 | DANN | Accuracy | 82.7 | — | Unverified
5 | MMER | Accuracy | 81.7 | — | Unverified
6 | PATHOSnet v2 | Accuracy | 80.4 | — | Unverified
7 | Self-attention weight correction (A+T) | Accuracy | 76.8 | — | Unverified
8 | CHFusion | Accuracy | 76.5 | — | Unverified
9 | bc-LSTM | Weighted F1 | 74.1 | — | Unverified
10 | Audio + Text (Stage III) | F1 | 70.5 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 66.71 | — | Unverified
2 | Audio + Text (Stage III) | Weighted F1 | 65.8 | — | Unverified
3 | Joyful | Weighted F1 | 61.77 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 72.81 | — | Unverified
2 | Joyful | Weighted F1 | 70.5 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 44.93 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 66.73 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SMPLify-X | v2v error | 52.9 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 74.31 | — | Unverified
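Most entries above report Weighted F1: per-class F1 averaged with weights proportional to each class's support, so frequent emotions count more. As a reference for readers re-scoring submissions, here is a minimal plain-Python sketch of that metric (intended to match `sklearn.metrics.f1_score(average="weighted")`; the function name is ours, not part of any leaderboard tooling):

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Classes absent from y_true contribute zero weight; a class with no
    predicted or true positives gets F1 = 0 for that class.
    """
    support = Counter(y_true)
    total = 0.0
    for cls, n_cls in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_cls - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += n_cls * f1  # weight each class by its support
    return total / len(y_true)
```

Because the weighting follows class frequency, a Weighted F1 score on the imbalanced IEMOCAP label distribution can differ noticeably from macro F1 or plain accuracy, which is why the metric column matters when comparing rows.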