Lightweight Models for Multimodal Sequential Data

2021-04-01 · EACL (WASSA) 2021

Soumya Sourav, Jessica Ouyang

Abstract

Human language encompasses more than just text; it also conveys emotions through tone and gestures. We present a case study of three simple and efficient Transformer-based architectures for predicting sentiment and emotion in multimodal data. The Late Fusion model merges unimodal features to create a multimodal feature sequence, the Round Robin model iteratively combines bimodal features using cross-modal attention, and the Hybrid Fusion model combines trimodal and unimodal features together to form a final feature sequence for predicting sentiment. Our experiments show that our small models are effective and outperform the publicly released versions of much larger, state-of-the-art multimodal sentiment analysis systems.
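To make the fusion strategies named in the abstract more concrete, here is a minimal NumPy sketch of two of them. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the three modalities are assumed to be already aligned to the same sequence length and feature size, the learned projections and Transformer layers are omitted, and the round-robin pairing order (text from audio, audio from vision, vision from text) is a guess at what "iteratively combines bimodal features" could look like.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(target, source):
    """Scaled dot-product attention with queries from `target` and
    keys/values from `source`.

    target: (T_target, d), source: (T_source, d) -> (T_target, d).
    Assumption: no learned query/key/value projections, to keep the sketch short.
    """
    d = target.shape[-1]
    scores = target @ source.T / np.sqrt(d)   # (T_target, T_source)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ source


def late_fusion(text, audio, vision):
    """Merge unimodal feature sequences into one multimodal sequence by
    concatenating along the feature axis (assumes equal sequence lengths)."""
    return np.concatenate([text, audio, vision], axis=-1)


def round_robin(text, audio, vision):
    """Hypothetical round-robin pairing: each modality attends to the next
    one in a cycle, and the three bimodal sequences are concatenated."""
    text_from_audio = cross_modal_attention(text, audio)
    audio_from_vision = cross_modal_attention(audio, vision)
    vision_from_text = cross_modal_attention(vision, text)
    return np.concatenate(
        [text_from_audio, audio_from_vision, vision_from_text], axis=-1
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy features: 5 time steps, 8 dimensions per modality.
    text, audio, vision = (rng.standard_normal((5, 8)) for _ in range(3))
    print(late_fusion(text, audio, vision).shape)
    print(round_robin(text, audio, vision).shape)
```

In a full model, either fused sequence would be passed to a Transformer encoder and a prediction head for sentiment or emotion; those parts are left out here.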
