Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
Code Available
- github.com/graykode/nlp-tutorial (pytorch, ★ 14,880)
- github.com/awslabs/sockeye (mxnet, ★ 1,218)
- github.com/theamrzaki/text_summurization_abstractive_methods (tf, ★ 530)
- github.com/IS5882/Open-CyKG (tf, ★ 89)
- github.com/Nick-Zhao-Engr/Machine-Translation (pytorch, ★ 17)
- github.com/yurayli/stanford-cs224n-sol (pytorch, ★ 6)
- github.com/distractor-generation/dg_survey (none, ★ 4)
- github.com/hiun/learning-transformers (pytorch, ★ 3)
- github.com/xingniu/sockeye (mxnet, ★ 3)
- github.com/simonjisu/NMT (pytorch, ★ 2)
Abstract
Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend it by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
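The (soft-)search the abstract describes is additive attention: for each decoder step, every encoder annotation h_i is scored against the decoder state s with a small feed-forward network, the scores are softmax-normalized into alignment weights, and the context vector is the weighted sum of annotations. A minimal NumPy sketch, with hypothetical toy dimensions and randomly initialized weight matrices (W_a, U_a, v are illustrative names, not taken from any released code):

```python
import numpy as np

def additive_attention(s, H, W_a, U_a, v):
    """Additive (Bahdanau-style) attention sketch.

    s   : decoder hidden state, shape (n_s,)
    H   : encoder annotations, shape (T, n_h), one row per source position
    W_a, U_a, v : attention parameters (hypothetical toy initialization)
    Returns the alignment weights alpha and the context vector c.
    """
    # scores e_i = v^T tanh(W_a s + U_a h_i), one scalar per source position
    e = np.tanh(s @ W_a.T + H @ U_a.T) @ v
    # softmax over source positions -> soft alignment weights (sum to 1)
    alpha = np.exp(e - e.max())
    alpha /= alpha.sum()
    # context vector: expectation of the annotations under alpha
    c = alpha @ H
    return alpha, c

# toy example: 5 source positions, small hidden sizes
rng = np.random.default_rng(0)
T, n_s, n_h, n_a = 5, 4, 6, 3
s = rng.standard_normal(n_s)
H = rng.standard_normal((T, n_h))
W_a = rng.standard_normal((n_a, n_s))
U_a = rng.standard_normal((n_a, n_h))
v = rng.standard_normal(n_a)

alpha, c = additive_attention(s, H, W_a, U_a, v)
```

Because alpha is a probability distribution over source positions rather than a hard segmentation, the whole model stays differentiable and the alignment is learned jointly with translation.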
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Persona-Chat | Seq2Seq + Attention | Avg F1 | 16.18 | — | Unverified |