
Sentence Compression for Arbitrary Languages via Multilingual Pivoting

2018-10-01 · EMNLP 2018 · Code Available

Jonathan Mallinson, Rico Sennrich, Mirella Lapata


Abstract

In this paper we advocate the use of bilingual corpora, which are abundantly available, for training sentence compression models. Our approach borrows much of its machinery from neural machine translation and leverages bilingual pivoting: compressions are obtained by translating a source string into a foreign language and then back-translating it into the source language while controlling the translation length. Our model can be trained for any language for which a bilingual corpus is available and performs arbitrary rewrites without access to compression-specific data. We release Moss, a new parallel Multilingual Compression dataset for English, German, and French, which can be used to evaluate compression models across languages and genres.
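The pivoting idea in the abstract can be sketched in a few lines. The snippet below is a toy illustration, not the authors' system: `translate` and `back_translate` are hypothetical stand-ins that return hand-written candidates, whereas the paper uses trained NMT models. What it does show is the pipeline shape: forward-translate into a pivot language, then pick a back-translation that satisfies a length budget.

```python
def translate(src):
    # Hypothetical forward translation (English -> German pivot).
    # A real system would call a trained NMT model here.
    return "das haus das ich gestern sah ist gross"

def back_translate(pivot, max_len):
    # Hypothetical back-translation: candidate paraphrases with scores.
    # In an NMT setup these would come from the decoder's beam.
    candidates = [
        ("the house that i saw yesterday is big", 0.9),
        ("the house i saw is big", 0.8),
        ("the house is big", 0.7),
    ]
    # Length control: keep candidates within the budget, take the best-scoring one.
    valid = [(s, p) for s, p in candidates if len(s.split()) <= max_len]
    return max(valid, key=lambda sp: sp[1])[0]

def compress(src, ratio=0.6):
    # Budget the output length as a fraction of the source length.
    budget = max(1, int(len(src.split()) * ratio))
    return back_translate(translate(src), budget)

print(compress("the house that i saw yesterday is big"))
```

Because length is enforced at decoding time rather than learned from compression data, the same pipeline applies to any language pair with a bilingual corpus, which is the point the abstract makes.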
