Explore Better Relative Position Embeddings from Encoding Perspective for Transformer Models

2021-11-01 · EMNLP 2021 · Code Available

Anlin Qu, Jianwei Niu, Shasha Mo


Abstract

Relative position embedding (RPE) is a successful method for explicitly and effectively encoding position information into Transformer models. In this paper, we investigate potential problems in Shaw-RPE and XL-RPE, the most representative and prevalent RPEs, and propose two novel RPEs: Low-level Fine-grained High-level Coarse-grained (LFHC) RPE and Gaussian Cumulative Distribution Function (GCDF) RPE. LFHC-RPE improves on Shaw-RPE by enhancing the model's perception of medium and long relative positions. GCDF-RPE exploits favorable properties of the Gaussian function to amend the encoding mechanism of XL-RPE. Experimental results on nine authoritative datasets demonstrate the effectiveness of our methods. Furthermore, GCDF-RPE achieves the best overall performance among the five RPEs compared.
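The core intuition behind a Gaussian-CDF encoding can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`gaussian_cdf`, `encode_relative_positions`) and the bandwidth parameter `sigma` are assumptions chosen for illustration. The sketch only shows the property the abstract alludes to: mapping relative distances through the Gaussian CDF separates nearby offsets finely while distant offsets saturate smoothly toward 0 or 1.

```python
import math

def gaussian_cdf(x, sigma=8.0):
    # CDF of a zero-mean Gaussian with standard deviation sigma,
    # expressed via the error function (stdlib only).
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def encode_relative_positions(max_dist, sigma=8.0):
    # Map signed relative distances -max_dist..max_dist into (0, 1).
    # Nearby offsets get well-separated codes (fine-grained);
    # distant offsets compress toward the tails (coarse-grained).
    return {d: gaussian_cdf(d, sigma) for d in range(-max_dist, max_dist + 1)}

codes = encode_relative_positions(32)
near_gap = codes[1] - codes[0]    # resolution near the query position
far_gap = codes[32] - codes[31]   # resolution at long range
```

Because the Gaussian density decays with distance, `near_gap` is much larger than `far_gap`: the encoding spends most of its resolution on small relative offsets, mirroring the fine-to-coarse behavior the abstract describes.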
