Sequence-to-Sequence Learning as Beam-Search Optimization

2016-06-09 · EMNLP 2016

Sam Wiseman, Alexander M. Rush

Code

github.com/harvardnlp/BSO (official, in paper)
github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning (PyTorch)
github.com/AndreiMoraru123/ContextCollector (PyTorch)

Abstract

Sequence-to-Sequence (seq2seq) modeling has rapidly become an important general-purpose NLP tool that has proven effective for many text-generation and sequence-labeling tasks. Seq2seq builds on deep neural language modeling and inherits its remarkable accuracy in estimating local, next-word distributions. In this work, we introduce a model and beam-search training scheme, based on the work of Daumé III and Marcu (2005), that extends seq2seq to learn global sequence scores. This structured approach avoids classical biases associated with local training and unifies the training loss with the test-time usage, while preserving the proven model architecture of seq2seq and its efficient training approach. We show that our system outperforms a highly optimized attention-based seq2seq system and other baselines on three different sequence-to-sequence tasks: word ordering, parsing, and machine translation.
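
To make the idea of ranking hypotheses by a global sequence score more concrete, here is a minimal sketch of beam search over cumulative, unnormalized scores. It is not the authors' implementation: `beam_search`, `margin_violation`, and `toy_step_score` are hypothetical names, the scorer is a toy stand-in for a neural model, and the hinge-style penalty is an assumed illustration of a search-aware loss, since the abstract does not spell out the exact training objective.

```python
from typing import Callable, List, Sequence, Tuple

# (cumulative score, tokens so far)
Hypothesis = Tuple[float, Tuple[str, ...]]


def beam_search(
    step_score: Callable[[Tuple[str, ...], str], float],
    vocab: Sequence[str],
    length: int,
    beam_size: int,
) -> List[Hypothesis]:
    """Keep the beam_size highest-scoring prefixes at every step."""
    beam: List[Hypothesis] = [(0.0, ())]
    for _ in range(length):
        candidates = [
            (score + step_score(prefix, tok), prefix + (tok,))
            for score, prefix in beam
            for tok in vocab
        ]
        # Rank by the cumulative sequence score rather than a local,
        # per-step normalized probability.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_size]
    return beam


def margin_violation(gold_score: float, kth_score: float, margin: float = 1.0) -> float:
    """Assumed hinge penalty: positive when the gold prefix does not beat
    the lowest-ranked beam entry by at least `margin`."""
    return max(0.0, margin - (gold_score - kth_score))


def toy_step_score(prefix: Tuple[str, ...], tok: str) -> float:
    """Hypothetical scorer: reward tokens that continue alphabetical order."""
    if not prefix:
        return 0.0
    return 1.0 if tok > prefix[-1] else -1.0


beam = beam_search(toy_step_score, vocab=["a", "b", "c"], length=3, beam_size=2)
best_score, best_tokens = beam[0]
```

In this toy setup the same beam procedure could be run during training and at test time, and a penalty such as `margin_violation` would be nonzero whenever the reference prefix scores too close to, or below, the last hypothesis kept on the beam.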

Tasks

Language Modeling, Machine Translation, Text Generation, Translation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
IWSLT2015 German-English	Word-level CNN w/attn, input feeding	BLEU score	24	—	Unverified
