A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size

2017-11-01 · WS 2017

Masato Neishi, Jin Sakuma, Satoshi Tohda, Shonosuke Ishiwatari, Naoki Yoshinaga, Masashi Toyoda


Code

github.com/nem6ishi/wat17
Official · In paper · tf

Abstract

In this paper, we describe the team UT-IIS's system and results for the WAT 2017 translation tasks. We further investigated several tricks: we introduced a novel technique for initializing embedding layers using only the parallel corpus, which increased the BLEU score by 1.28; found a practical large batch size of 256; and gained insights regarding hyperparameter settings. Ultimately, our system obtained a better result than the state-of-the-art system of WAT 2016. Our code is available at https://github.com/nem6ishi/wat17.

Tasks

Machine Translation, Translation, Word Embeddings
