Generating Paraphrases with Lean Vocabulary

2019-10-01 · WS 2019

Tadashi Nomoto

Abstract

In this work, we examine whether it is possible to achieve state-of-the-art performance in paraphrase generation with a reduced vocabulary. Our approach consists of building a convolution-to-sequence model (Conv2Seq), partially guided by reinforcement learning, and training it on a subword representation of the input. An experiment on the Quora dataset, which contains over 140,000 pairs of sentences and corresponding paraphrases, found that with fewer than 1,000 token types we were able to exceed the performance of the current state of the art.
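To illustrate how a subword representation can keep the number of token types small, here is a minimal sketch of byte-pair encoding (BPE). It is an assumption for illustration only: the abstract does not say which subword method the paper uses, and the function names (`learn_bpe`, `merge_pair`, `token_types`) and the two example sentences are hypothetical, not taken from the paper or the Quora dataset. The point it shows is that the vocabulary is bounded by the number of base characters plus the number of merges, so the merge budget directly caps the token-type count.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` BPE merges from a list of sentences (illustrative only)."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    words = Counter()
    for sentence in corpus:
        for word in sentence.lower().split():
            words[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken lexicographically.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the merge to every word.
        merged = Counter()
        for symbols, freq in words.items():
            merged[merge_pair(symbols, best)] += freq
        words = merged
    return merges, words


def token_types(words):
    """Return the set of distinct subword symbols currently in use."""
    return {s for symbols in words for s in symbols}


if __name__ == "__main__":
    # Hypothetical example sentences, not drawn from the Quora dataset.
    corpus = ["how do i learn python", "how can i learn python quickly"]
    merges, words = learn_bpe(corpus, num_merges=5)
    print(len(token_types(words)), "token types after", len(merges), "merges")
```

Under this scheme, choosing a merge budget below 1,000 minus the number of base characters would keep the vocabulary under 1,000 token types, whatever the size of the word-level vocabulary.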