Recurrent Attention for the Transformer
2021-11-01 · EMNLP (Insights) 2021
Jan Rosendahl, Christian Herold, Frithjof Petrick, Hermann Ney
Abstract
In this work, we conduct a comprehensive investigation of one of the centerpieces of modern machine translation systems: the encoder-decoder attention mechanism. Motivated by the concept of first-order alignments, we extend the (cross-)attention mechanism with a recurrent connection, allowing direct access to previous attention/alignment decisions. We propose several ways to include such a recurrence in the attention mechanism. Evaluating their performance across different translation tasks, we conclude that these extensions and dependencies are not beneficial for the translation performance of the Transformer architecture.
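To make the idea concrete, here is a minimal NumPy sketch of cross-attention with one possible recurrent connection: each decoder step's attention distribution is interpolated with the previous step's distribution via a mixing weight `gate`. This is an illustrative assumption, not the paper's actual formulation — the authors propose several variants, and the function name, the interpolation form, and the fixed `gate` parameter are all hypothetical choices made here for clarity.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def recurrent_cross_attention(queries, keys, values, gate=0.5):
    """Cross-attention with a recurrent link to the previous step's
    attention weights (one simple illustrative variant).

    queries: (T, d) decoder states, processed step by step
    keys, values: (S, d) encoder states
    gate: mixing weight for the previous attention distribution
    Returns (T, d) context vectors and (T, S) attention weights.
    """
    d = queries.shape[-1]
    prev_alpha = None
    outputs, alphas = [], []
    for q in queries:
        scores = keys @ q / np.sqrt(d)     # (S,) scaled dot-product scores
        alpha = softmax(scores)            # standard attention distribution
        if prev_alpha is not None:
            # Recurrent connection: blend in the previous step's attention.
            alpha = (1 - gate) * alpha + gate * prev_alpha
        outputs.append(alpha @ values)     # context vector for this step
        alphas.append(alpha)
        prev_alpha = alpha
    return np.stack(outputs), np.stack(alphas)
```

Because the recurrence here is a convex combination of two probability distributions, each step's mixed attention weights still sum to one; other variants (e.g., feeding previous weights into the score computation before the softmax) would achieve the same normalization differently.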