SOTAVerified

Multimodal Emotion Recognition

This is a leaderboard for multimodal emotion recognition on the IEMOCAP dataset. The modality abbreviations are A (acoustic), T (text), and V (visual).

Please give the modalities in brackets after the model name, for example (A+T).

All models must use the standard five emotion categories and be evaluated with the standard leave-one-session-out (LOSO) protocol. See the papers for references.
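As a rough illustration of the LOSO protocol, the sketch below holds out one recording session per fold and scores predictions with a support-weighted F1, the metric most entries below report. It assumes each utterance carries a session label (IEMOCAP has five sessions). The function names `loso_splits` and `weighted_f1` and the toy data are hypothetical and not taken from any listed paper.

```python
from collections import Counter


def loso_splits(session_ids):
    """Yield (held_out, train_idx, test_idx), holding out one session per fold."""
    for held_out in sorted(set(session_ids)):
        train_idx = [i for i, s in enumerate(session_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(session_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Toy example (assumed data): two utterances from each of five sessions.
sessions = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
for held_out, train_idx, test_idx in loso_splits(sessions):
    # Train on train_idx, evaluate on test_idx, then average scores over folds.
    print(f"held-out session {held_out}: {len(train_idx)} train, {len(test_idx)} test")
```

Holding out a whole session keeps test speakers out of the training data, which is the usual reason for preferring this split over a random one.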

Papers

Showing 51–100 of 180 papers

Title | Status | Hype
Joyful: Joint Modality Fusion and Graph Contrastive Learning for Multimodal Emotion Recognition | Code | 1
Latent Distribution Decoupling: A Probabilistic Framework for Uncertainty-Aware Multimodal Emotion Recognition | Code | 1
FV2ES: A Fully End2End Multimodal System for Fast Yet Effective Video Emotion Recognition Inference | Code | 1
GA2MIF: Graph and Attention Based Two-Stage Multi-Source Information Fusion for Conversational Emotion Detection | Code | 1
DialogueRNN: An Attentive RNN for Emotion Detection in Conversations | Code | 1
GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition | Code | 1
Attentive Modality Hopping Mechanism for Speech Emotion Recognition | Code | 0
VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition | Code | 0
Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Incomplete Data Scenarios | Code | 0
Combining deep and unsupervised features for multilingual speech emotion recognition | Code | 0
Investigation of Multimodal Features, Classifiers and Fusion Methods for Emotion Recognition | Code | 0
Multi-Modal Emotion recognition on IEMOCAP Dataset using Deep Learning | Code | 0
ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection | Code | 0
Modality-Collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition | Code | 0
Multimodal Behavioral Markers Exploring Suicidal Intent in Social Media Videos | Code | 0
Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition | Code | 0
Multimodal Emotion Recognition Using Deep Canonical Correlation Analysis | Code | 0
Context-Dependent Sentiment Analysis in User-Generated Videos | Code | 0
TACFN: Transformer-based Adaptive Cross-modal Fusion Network for Multimodal Emotion Recognition | Code | 0
End-to-End Multimodal Emotion Recognition using Deep Neural Networks | Code | 0
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data | Code | 0
Complementary Fusion of Multi-Features and Multi-Modalities in Sentiment Analysis | Code | 0
Textualized and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild | Code | 0
Leveraging Contrastive Learning and Self-Training for Multimodal Emotion Recognition with Limited Labeled Samples | Code | 0
Feature-Based Dual Visual Feature Extraction Model for Compound Multimodal Emotion Recognition | Code | 0
Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling | Code | 0
Multimodal Speech Emotion Recognition and Ambiguity Resolution | Code | 0
Multimodal Speech Emotion Recognition Using Audio and Text | Code | 0
Multi Teacher Privileged Knowledge Distillation for Multimodal Expression Recognition | Code | 0
Learning Alignment for Multimodal Emotion Recognition from Speech | Code | 0
Multimodal Emotion Recognition with Vision-language Prompting and Modality Dropout | | 0
Multimodal End-to-End Group Emotion Recognition using Cross-Modal Attention | | 0
Multimodal Mixture of Low-Rank Experts for Sentiment Analysis and Emotion Recognition | | 0
MVP: Multimodal Emotion Recognition based on Video and Physiological Signals | | 0
Noise-Resistant Multimodal Transformer for Emotion Recognition | | 0
Progressive Modality Reinforcement for Human Multimodal Emotion Recognition From Unaligned Multimodal Sequences | | 0
PsyCounAssist: A Full-Cycle AI-Powered Psychological Counseling Assistant System | | 0
Revisiting Disentanglement and Fusion on Modality and Context in Conversational Multimodal Emotion Recognition | | 0
Revisiting Multimodal Emotion Recognition in Conversation from the Perspective of Graph Spectrum | | 0
Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring | | 0
Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors | | 0
Speech Emotion Recognition Based on Self-Attention Weight Correction for Acoustic and Text Features | | 0
TACOformer:Token-channel compounded Cross Attention for Multimodal Emotion Recognition | | 0
Towards Multimodal Emotion Recognition in German Speech Events in Cars using Transfer Learning | | 0
UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause | | 0
Unimodal-driven Distillation in Multimodal Emotion Recognition with Dynamic Fusion | | 0
Using Auxiliary Tasks In Multimodal Fusion Of Wav2vec 2.0 And BERT For Multimodal Emotion Recognition | | 0
Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition | | 0
Versatile audio-visual learning for emotion recognition | | 0
0/1 Deep Neural Networks via Block Coordinate Descent | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 86.52 | | Unverified
2 | Joyful | Weighted F1 | 85.7 | | Unverified
3 | COGMEN | Weighted F1 | 84.5 | | Unverified
4 | DANN | Accuracy | 82.7 | | Unverified
5 | MMER | Accuracy | 81.7 | | Unverified
6 | PATHOSnet v2 | Accuracy | 80.4 | | Unverified
7 | Self-attention weight correction (A+T) | Accuracy | 76.8 | | Unverified
8 | CHFusion | Accuracy | 76.5 | | Unverified
9 | bc-LSTM | Weighted F1 | 74.1 | | Unverified
10 | Audio + Text (Stage III) | F1 | 70.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 66.71 | | Unverified
2 | Audio + Text (Stage III) | Weighted F1 | 65.8 | | Unverified
3 | Joyful | Weighted F1 | 61.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 72.81 | | Unverified
2 | Joyful | Weighted F1 | 70.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 44.93 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 66.73 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SMPLify-X | v2v error | 52.9 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GraphSmile | Weighted F1 | 74.31 | | Unverified