Hierarchical Fusion for Online Multimodal Dialog Act Classification

2023-12-08 · EMNLP 2023

Md Messal Monem Miah, Adarsh Pyarelal, Ruihong Huang

Code

github.com/Dipto084/Hierarchical-Fusion-for-Online-Multimodal-Dialog-Act-Classification
Official · In paper · PyTorch · ★ 2

Abstract

We propose a framework for online multimodal dialog act (DA) classification based on raw audio and ASR-generated transcriptions of current and past utterances. Existing multimodal DA classification approaches are limited by ineffective audio modeling and late-stage fusion. We showcase significant improvements in multimodal DA classification by integrating modalities at a more granular level and incorporating recent advancements in large language and audio models for audio feature extraction. We further investigate the effectiveness of self-attention and cross-attention mechanisms in modeling utterances and dialogs for DA classification. We achieve a substantial increase of 3 percentage points in the F1 score relative to current state-of-the-art models on two prominent DA classification datasets, MRDA and EMOTyDA.
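
To make the idea of fusing modalities "at a more granular level" concrete, here is a minimal sketch of token-level cross-attention, in which each text token attends over the audio frames of the same utterance. It is an illustration under stated assumptions, not the authors' architecture: the function names (`cross_attention`, `fuse_utterance`), the toy 2-dimensional vectors, the absence of learned projections, and the concatenate-then-mean-pool step are all hypothetical choices made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys/values.

    Assumption: no learned query/key/value projections, to keep the sketch short.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse_utterance(text_tokens, audio_frames):
    """Hypothetical token-level fusion: text tokens query the audio frames,
    each token is concatenated with its attended audio vector, and the
    result is mean-pooled into a single utterance vector."""
    attended = cross_attention(text_tokens, audio_frames, audio_frames)
    fused = [t + a for t, a in zip(text_tokens, attended)]  # list concatenation
    n = len(fused)
    return [sum(row[j] for row in fused) / n for j in range(len(fused[0]))]


# Toy inputs: 3 text-token embeddings and 4 audio-frame embeddings, both 2-d.
text_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
audio_frames = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.2, 0.8]]

utterance_vec = fuse_utterance(text_tokens, audio_frames)
print(len(utterance_vec))  # 4: text dimension (2) + attended audio dimension (2)
```

In a late-fusion design, by contrast, each modality would be pooled into its own utterance vector before the two are combined, so no text token would ever see an individual audio frame.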

Tasks

Classification, Dialog Act Classification, Dialogue Act Classification
