Grammar as a Foreign Language
Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton
Code
- github.com/gongshuangshuang/deep-text-corrector (tf, ★ 1)
- github.com/kmadathil/sanskrit_parser (none, ★ 0)
- github.com/KaustuvDash/DeepTextCorrector (none, ★ 0)
- github.com/atpaino/deep-text-corrector (tf, ★ 0)
- github.com/sunnysinghnitb/text_corrector_software (tf, ★ 0)
- github.com/sunnysinghnitb/text-corrector-software (tf, ★ 0)
- github.com/sambit9238/deep_text_corrector (tf, ★ 0)
Abstract
Syntactic constituency parsing is a fundamental problem in natural language processing and has been the subject of intensive research and engineering for decades. As a result, the most accurate parsers are domain-specific, complex, and inefficient. In this paper, we show that a domain-agnostic attention-enhanced sequence-to-sequence model achieves state-of-the-art results on the most widely used syntactic constituency parsing dataset when trained on a large synthetic corpus that was annotated using existing parsers. It also matches the performance of standard parsers when trained only on a small human-annotated dataset, which shows that this model is highly data-efficient, in contrast to sequence-to-sequence models without the attention mechanism. Our parser is also fast, processing over a hundred sentences per second with an unoptimized CPU implementation.
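The key idea behind treating "grammar as a foreign language" is to serialize a constituency parse tree into a flat bracketed token sequence, so that parsing becomes a sequence-to-sequence translation task. A minimal sketch of such a depth-first linearization is shown below; the function and the toy tree are illustrative, not taken from the paper's code.

```python
# Hypothetical sketch of the linearization described in the abstract:
# a constituency tree is emitted as a bracketed token sequence that a
# seq2seq model can be trained to produce from the raw sentence.

def linearize(tree):
    """Depth-first linearization: (label, children...) -> token list.
    Leaf words are normalized to the placeholder 'XX', since the model
    predicts tree structure rather than the words themselves."""
    if isinstance(tree, str):              # a leaf word
        return ["XX"]
    label, children = tree[0], tree[1:]
    tokens = ["(" + label]                 # opening bracket with label
    for child in children:
        tokens.extend(linearize(child))    # recurse into subtrees
    tokens.append(")" + label)             # matching labeled close
    return tokens

# Toy parse of "John has a dog ."
tree = ("S",
        ("NP", "John"),
        ("VP", "has", ("NP", "a", "dog")),
        ".")
print(" ".join(linearize(tree)))
# → (S (NP XX )NP (VP XX (NP XX XX )NP )VP XX )S
```

The labeled closing brackets (e.g. `)NP`) make the output sequence unambiguous to deterministically convert back into a tree, which is what lets a generic attention-based seq2seq model serve as a parser.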
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Penn Treebank | Semi-supervised LSTM | F1 score | 92.1 | — | Unverified |