SOTAVerified

Multimodal Emotion Recognition

This is a leaderboard for multimodal emotion recognition on the IEMOCAP dataset. The modality abbreviations are A: Acoustic, T: Text, V: Visual.

Please include the modalities in brackets after the model name, e.g. "Model (A+T)".

All models must use the standard five emotion categories and are evaluated with the standard leave-one-session-out (LOSO) protocol. See the papers for references.
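
For reference, LOSO here means training on four of IEMOCAP's five sessions and testing on the held-out fifth, repeated over all five folds and averaged. A minimal sketch of the protocol with synthetic stand-in data (the random features, the logistic-regression classifier, and the array layout are illustrative assumptions, not any listed model's pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Stand-in data: in practice X holds per-utterance A/T/V features,
# y the five emotion labels, and session the IEMOCAP session id (1-5).
X = rng.normal(size=(1000, 32))
y = rng.integers(0, 5, size=1000)
session = rng.integers(1, 6, size=1000)

fold_scores = []
for held_out in range(1, 6):                  # IEMOCAP has 5 sessions
    train, test = session != held_out, session == held_out
    model = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    pred = model.predict(X[test])
    # Weighted F1 is the headline metric on this leaderboard
    fold_scores.append(f1_score(y[test], pred, average="weighted"))

print(f"LOSO weighted F1: {100 * np.mean(fold_scores):.2f}")
```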

Papers

Showing 126–150 of 180 papers

Title | Status | Hype
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses | Code | 1
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models | – | 0
Interpretability for Multimodal Emotion Recognition using Concept Activation Vectors | – | 0
Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition | Code | 1
A proposal for Multimodal Emotion Recognition using aural transformers and Action Units on RAVDESS dataset | Code | 1
Shapes of Emotions: Multimodal Emotion Recognition in Conversations via Emotion Shifts | Code | 1
LMR-CBT: Learning Modality-fused Representations with CB-Transformer for Multimodal Emotion Recognition from Unaligned Multimodal Sequences | – | 0
Multimodal Emotion Recognition on RAVDESS Dataset Using Transfer Learning | – | 0
Multimodal End-to-End Group Emotion Recognition using Cross-Modal Attention | – | 0
Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition | Code | 1
A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition | Code | 1
MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition | – | 0
Multimodal Emotion-Cause Pair Extraction in Conversations | – | 0
Multimodal Emotion Recognition with High-level Speech and Text Features | Code | 1
Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition | – | 0
Progressive Modality Reinforcement for Human Multimodal Emotion Recognition From Unaligned Multimodal Sequences | – | 0
Analyzing the Influence of Dataset Composition for Emotion Recognition | – | 0
Combining deep and unsupervised features for multilingual speech emotion recognition | Code | 0
MSAF: Multimodal Split Attention Fusion | Code | 1
Context-Dependent Domain Adversarial Neural Network for Multimodal Emotion Recognition | – | 0
Emotion recognition by fusing time synchronous and time asynchronous representations | – | 0
Multimodal Emotion Recognition with Transformer-Based Self Supervised Feature Fusion | Code | 1
An Audio-Video Deep and Transfer Learning Framework for Multimodal Emotion Recognition in the wild | – | 0
Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition | Code | 1
Jointly Fine-Tuning “BERT-like” Self Supervised Models to Improve Multimodal Speech Emotion Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 86.52 | – | Unverified
2 | Joyful | Weighted F1 | 85.7 | – | Unverified
3 | COGMEN | Weighted F1 | 84.5 | – | Unverified
4 | DANN | Accuracy | 82.7 | – | Unverified
5 | MMER | Accuracy | 81.7 | – | Unverified
6 | PATHOSnet v2 | Accuracy | 80.4 | – | Unverified
7 | Self-attention weight correction (A+T) | Accuracy | 76.8 | – | Unverified
8 | CHFusion | Accuracy | 76.5 | – | Unverified
9 | bc-LSTM | Weighted F1 | 74.1 | – | Unverified
10 | Audio + Text (Stage III) | F1 | 70.5 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 66.71 | – | Unverified
2 | Audio + Text (Stage III) | Weighted F1 | 65.8 | – | Unverified
3 | Joyful | Weighted F1 | 61.77 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 72.81 | – | Unverified
2 | Joyful | Weighted F1 | 70.5 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 44.93 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 66.73 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SMPLify-X | v2v error | 52.9 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 74.31 | – | Unverified
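
Most entries above report weighted F1: the per-class F1 scores averaged with weights equal to each class's support, so frequent emotion categories contribute proportionally more than rare ones. A small check with made-up labels (the toy arrays are assumptions for illustration), showing the manual weighted average agreeing with scikit-learn:

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy ground truth and predictions over the five emotion classes (0-4)
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 3, 4, 4])
y_pred = np.array([0, 0, 1, 1, 1, 2, 3, 3, 4, 0])

per_class = f1_score(y_true, y_pred, average=None)  # one F1 per class
support = np.bincount(y_true)                       # class counts in y_true

# Weighted F1 = support-weighted mean of the per-class F1 scores
manual = np.average(per_class, weights=support)
assert np.isclose(manual, f1_score(y_true, y_pred, average="weighted"))
print(f"weighted F1 = {manual:.3f}")
```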