Improving the Diversity of Unsupervised Paraphrasing with Embedding Outputs
Monisha Jegadeesan, Sachin Kumar, John Wieting, Yulia Tsvetkov
- Code: github.com/monisha-jega/paraphrasing_embedding_outputs (official PyTorch implementation)
Abstract
We present a novel technique for zero-shot paraphrase generation. The key contribution is an end-to-end multilingual paraphrasing model that is trained using translated parallel corpora to generate paraphrases into "meaning spaces": replacing the final softmax layer with word embeddings. This architectural modification, plus a training procedure that incorporates an autoencoding objective, enables effective parameter sharing across languages for more fluent monolingual rewriting, and facilitates fluency and diversity in generation. Our continuous-output paraphrase generation models outperform zero-shot paraphrasing baselines when evaluated on two languages using a battery of computational metrics as well as in human assessment.
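To make the core architectural change concrete, below is a minimal PyTorch sketch of a continuous-output decoder head: instead of projecting hidden states to vocabulary logits followed by a softmax, it projects into a pretrained word-embedding space, trains with a cosine-distance loss, and decodes by nearest-neighbor search over the embedding table. All names here (`ContinuousOutputHead`, `emb_table`, and so on) are illustrative assumptions, not the paper's actual implementation, and the cosine loss is one of several losses explored in prior continuous-output generation work.

```python
# Hypothetical sketch of a continuous-output decoder head (assumed names,
# not the authors' code). The usual softmax layer over the vocabulary is
# replaced by a projection into a frozen pretrained embedding space.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContinuousOutputHead(nn.Module):
    def __init__(self, hidden_dim: int, emb_table: torch.Tensor):
        super().__init__()
        # Frozen pretrained embeddings define the target "meaning space".
        self.emb_table = nn.Parameter(emb_table, requires_grad=False)
        self.proj = nn.Linear(hidden_dim, emb_table.size(1))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim) -> predicted word embeddings
        return self.proj(hidden)

    def loss(self, hidden: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
        pred = self.forward(hidden)        # (B, T, E) predicted embeddings
        gold = self.emb_table[target_ids]  # (B, T, E) gold-word embeddings
        # Cosine-distance training loss (an assumption for this sketch;
        # probabilistic losses are also used in continuous-output models).
        return (1.0 - F.cosine_similarity(pred, gold, dim=-1)).mean()

    @torch.no_grad()
    def decode(self, hidden: torch.Tensor) -> torch.Tensor:
        # Decoding is nearest-neighbor lookup in embedding space.
        pred = F.normalize(self.forward(hidden), dim=-1)  # (B, T, E)
        table = F.normalize(self.emb_table, dim=-1)       # (V, E)
        return (pred @ table.T).argmax(dim=-1)            # token ids
```

Because the output space is a shared multilingual embedding space rather than a language-specific vocabulary softmax, a head like this lets encoder and decoder parameters be shared across languages, which is what the abstract credits for the gains in fluency and diversity.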