A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation

2016-03-19 · ACL 2016

Junyoung Chung, Kyunghyun Cho, Yoshua Bengio

Code

github.com/nyu-dl/dl4mt-cdec (★ 167)
github.com/nyu-dl/dl4mt-c2c (★ 0)

Abstract

The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder-decoder with a subword-level encoder and a character-level decoder on four language pairs (En-Cs, En-De, En-Ru and En-Fi) using the parallel corpora from WMT'15. Our experiments show that the models with a character-level decoder outperform the ones with a subword-level decoder on all of the four language pairs. Furthermore, the ensembles of neural models with a character-level decoder outperform the state-of-the-art non-neural machine translation systems on En-Cs, En-De and En-Fi and perform comparably on En-Ru.
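
To make "a character sequence without any explicit segmentation" concrete, the following is a minimal Python sketch of how a target sentence could be represented for a character-level decoder: every character, including the space, is an ordinary output symbol, so no word or subword boundaries are fixed in advance. The helper names (`build_char_vocab`, `encode_chars`, `decode_chars`) and the `</s>` end-of-sequence symbol are assumptions made for illustration; they are not taken from the paper or the linked repositories, and the sketch does not show the model itself.

```python
# Illustrative sketch only: not code from the paper or its repositories.
EOS = "</s>"  # hypothetical end-of-sequence symbol


def build_char_vocab(sentences):
    """Map every character seen in the target sentences to an integer id."""
    chars = sorted({ch for s in sentences for ch in s})
    # EOS gets id 0; characters follow in sorted order.
    return {sym: i for i, sym in enumerate([EOS] + chars)}


def encode_chars(sentence, vocab):
    """Turn a target sentence into character ids; spaces are ordinary symbols."""
    return [vocab[ch] for ch in sentence] + [vocab[EOS]]


def decode_chars(ids, vocab):
    """Invert encode_chars, stopping at the end-of-sequence symbol."""
    inv = {i: sym for sym, i in vocab.items()}
    out = []
    for i in ids:
        if inv[i] == EOS:
            break
        out.append(inv[i])
    return "".join(out)


targets = ["ein kleines Haus", "das Haus ist klein"]
vocab = build_char_vocab(targets)
ids = encode_chars(targets[0], vocab)
restored = decode_chars(ids, vocab)
```

In this representation the target sequence has one symbol per character plus the end marker, so it is longer than a subword sequence for the same sentence, but the output vocabulary stays small and the text is recovered by plain concatenation.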

Tasks

Decoder, de-en, Machine Translation, Segmentation, Translation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
WMT2015 English-German	Enc-Dec Att (char)	BLEU score	23.5	—	Unverified
WMT2015 English-German	Enc-Dec Att (BPE)	BLEU score	21.7	—	Unverified
