SOTAVerified

Multilingual Lexical Simplification via Paraphrase Generation

2023-07-28Code Available0· sign in to hype

Kang Liu, Jipeng Qiang, Yun Li, Yunhao Yuan, Yi Zhu, Kaixun Hua

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

Lexical simplification (LS) methods based on pretrained language models have made remarkable progress, generating potential substitutes for a complex word through analysis of its contextual surroundings. However, these methods require separate pretrained models for different languages and disregard the preservation of sentence meaning. In this paper, we propose a novel multilingual LS method via paraphrase generation, as paraphrases provide diversity in word selection while preserving the sentence's meaning. We regard paraphrasing as a zero-shot translation task within multilingual neural machine translation that supports hundreds of languages. After feeding the input sentence into the encoder of paraphrase modeling, we generate the substitutes based on a novel decoding strategy that concentrates solely on the lexical variations of the complex word. Experimental results demonstrate that our approach surpasses BERT-based methods and zero-shot GPT3-based method significantly on English, Spanish, and Portuguese.

Tasks

Reproductions