Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
Code Available
- github.com/graykode/nlp-tutorial (pytorch, ★ 14,880)
- github.com/awslabs/sockeye (mxnet, ★ 1,218)
- github.com/theamrzaki/text_summurization_abstractive_methods (tf, ★ 530)
- github.com/IS5882/Open-CyKG (tf, ★ 89)
- github.com/Nick-Zhao-Engr/Machine-Translation (pytorch, ★ 17)
- github.com/yurayli/stanford-cs224n-sol (pytorch, ★ 6)
- github.com/distractor-generation/dg_survey (none, ★ 4)
- github.com/hiun/learning-transformers (pytorch, ★ 3)
- github.com/xingniu/sockeye (mxnet, ★ 3)
- github.com/simonjisu/NMT (pytorch, ★ 2)
Abstract
Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend it by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
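The (soft-)search the abstract describes is additive attention: for each decoder step, every encoder annotation h_i is scored against the decoder state s with a small feed-forward network, the scores are softmax-normalized into alignment weights, and the context vector is the weighted sum of annotations. A minimal NumPy sketch, with hypothetical toy dimensions and randomly initialized weight matrices (W_a, U_a, v are illustrative names, not taken from any released code):

```python
import numpy as np

def additive_attention(s, H, W_a, U_a, v):
    """Additive (Bahdanau-style) attention sketch.

    s   : decoder hidden state, shape (n_s,)
    H   : encoder annotations, shape (T, n_h), one row per source position
    W_a, U_a, v : attention parameters (hypothetical toy initialization)
    Returns the alignment weights alpha and the context vector c.
    """
    # scores e_i = v^T tanh(W_a s + U_a h_i), one scalar per source position
    e = np.tanh(s @ W_a.T + H @ U_a.T) @ v
    # softmax over source positions -> soft alignment weights (sum to 1)
    alpha = np.exp(e - e.max())
    alpha /= alpha.sum()
    # context vector: expectation of the annotations under alpha
    c = alpha @ H
    return alpha, c

# toy example: 5 source positions, small hidden sizes
rng = np.random.default_rng(0)
T, n_s, n_h, n_a = 5, 4, 6, 3
s = rng.standard_normal(n_s)
H = rng.standard_normal((T, n_h))
W_a = rng.standard_normal((n_a, n_s))
U_a = rng.standard_normal((n_a, n_h))
v = rng.standard_normal(n_a)

alpha, c = additive_attention(s, H, W_a, U_a, v)
```

Because alpha is a probability distribution over source positions rather than a hard segmentation, the whole model stays differentiable and the alignment is learned jointly with translation.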
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Persona-Chat | Seq2Seq + Attention | Avg F1 | 16.18 | — | Unverified |