Joyful: Joint Modality Fusion and Graph Contrastive Learning for Multimodal Emotion Recognition
Dongyuan Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura
Code
- github.com/wykstc/MERC-main (official implementation, PyTorch, ★ 18)
Abstract
Multimodal emotion recognition aims to recognize the emotion of each utterance from multiple modalities, and has received increasing attention for its applications in human-machine interaction. Current graph-based methods fail to simultaneously depict global contextual features and local, diverse uni-modal features in a dialogue. Furthermore, as the number of graph layers increases, they easily suffer from over-smoothing. In this paper, we propose a method for joint modality fusion and graph contrastive learning for multimodal emotion recognition (Joyful), where multimodal fusion, contrastive learning, and emotion recognition are jointly optimized. Specifically, we first design a new multimodal fusion mechanism that provides deep interaction and fusion between global contextual and uni-modal-specific features. Then, we introduce a graph contrastive learning framework with inter-view and intra-view contrastive losses to learn more distinguishable representations for samples with different sentiments. Extensive experiments on three benchmark datasets indicate that Joyful achieves state-of-the-art (SOTA) performance compared with all baselines.
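The abstract describes two components: a fusion mechanism that combines global contextual features with uni-modal-specific ones, and a graph contrastive learning framework with inter-view and intra-view losses. The PyTorch sketch below is only a rough illustration of those ideas under common formulations (a gated fusion, an InfoNCE inter-view term, and a supervised-contrastive intra-view term); the names `GatedFusion`, `info_nce`, and `graph_contrastive_loss` are hypothetical and are not taken from the paper or the linked repository, whose exact architecture and losses differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedFusion(nn.Module):
    """Toy gated fusion of a global contextual vector with a uni-modal feature.
    Illustrative only; the paper's fusion mechanism is more elaborate."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, global_ctx: torch.Tensor, unimodal: torch.Tensor) -> torch.Tensor:
        # The gate decides, per dimension, how much of each representation to keep.
        g = torch.sigmoid(self.gate(torch.cat([global_ctx, unimodal], dim=-1)))
        return g * global_ctx + (1 - g) * unimodal


def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE / NT-Xent loss: the i-th anchor should match the i-th positive."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                    # (N, N) cosine similarities
    labels = torch.arange(a.size(0), device=a.device)   # diagonal entries are positives
    return F.cross_entropy(logits, labels)


def graph_contrastive_loss(view1, view2, labels, temperature: float = 0.1) -> torch.Tensor:
    """Inter-view + intra-view contrastive objective over node embeddings.

    view1, view2: (N, d) embeddings of the same N utterances from two graph views.
    labels:       (N,) emotion labels used to choose intra-view positives.
    """
    # Inter-view term: the same utterance in the two views forms a positive pair.
    inter = 0.5 * (info_nce(view1, view2, temperature) + info_nce(view2, view1, temperature))

    # Intra-view term (supervised-contrastive flavour): utterances sharing an
    # emotion label are pulled together within a single view.
    z = F.normalize(view1, dim=-1)
    sim = z @ z.t() / temperature
    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)              # exclude self-similarity
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    pos_mask.fill_diagonal_(0)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    intra = -(pos_mask * log_prob).sum(1) / pos_mask.sum(1).clamp(min=1)
    return inter + intra.mean()


if __name__ == "__main__":
    # Smoke test with random features for 8 utterances of dimension 16.
    v1, v2 = torch.randn(8, 16), torch.randn(8, 16)
    y = torch.randint(0, 6, (8,))
    fused = GatedFusion(16)(v1, v2)
    print(fused.shape, graph_contrastive_loss(v1, v2, y).item())
```

In this sketch the inter-view term treats the two augmented graph views of the same utterance as a positive pair, while the intra-view term uses emotion labels to form positives within a single view; see the linked repository for the authors' actual formulation.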
Tasks
- Multimodal Emotion Recognition
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| IEMOCAP | Joyful | Weighted F1 | 70.5 | — | Unverified |
| IEMOCAP-4 | Joyful | Weighted F1 | 85.7 | — | Unverified |
| MELD | Joyful | Weighted F1 | 61.77 | — | Unverified |