Generating Paraphrases with Lean Vocabulary

2019-10-01 · WS 2019

Tadashi Nomoto


Abstract

In this work, we examine whether it is possible to achieve state-of-the-art performance in paraphrase generation with a reduced vocabulary. Our approach consists of building a convolution-to-sequence model (Conv2Seq), partially guided by reinforcement learning, and training it on a subword representation of the input. An experiment on the Quora dataset, which contains over 140,000 pairs of sentences and corresponding paraphrases, found that with fewer than 1,000 token types, we achieved performance exceeding the current state of the art.
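
To make the idea of a lean subword vocabulary concrete, here is a minimal sketch of how a small inventory of token types can be learned from a corpus. The abstract does not say which subword segmentation was used, so byte-pair-encoding-style merging is assumed here purely for illustration; the function names `learn_bpe` and `token_types` and the toy word list are hypothetical and not taken from the paper.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merges from a list of words (toy sketch)."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges, vocab


def token_types(vocab):
    """Return the set of distinct subword symbols in the segmented corpus."""
    return {s for symbols in vocab for s in symbols}


if __name__ == "__main__":
    corpus = ["low", "low", "lower", "newest", "newest", "widest"]
    merges, vocab = learn_bpe(corpus, num_merges=5)
    print(len(token_types(vocab)), "token types after", len(merges), "merges")
```

In this scheme the number of merges caps the vocabulary size: each merge adds at most one new token type, so a budget such as "fewer than 1,000 token types" would correspond to limiting the merge count accordingly.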

Tasks

Paraphrase Generation, Reinforcement Learning (RL)
