SOTAVerified

Predictive Incremental Parsing Helps Language Modeling

2016-12-01COLING 2016Unverified0· sign in to hype

Arne K{\"o}hn, Timo Baumann

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

Predictive incremental parsing produces syntactic representations of sentences as they are produced, e.g. by typing or speaking. In order to generate connected parses for such unfinished sentences, upcoming word types can be hypothesized and structurally integrated with already realized words. For example, the presence of a determiner as the last word of a sentence prefix may indicate that a noun will appear somewhere in the completion of that sentence, and the determiner can be attached to the predicted noun. We combine the forward-looking parser predictions with backward-looking N-gram histories and analyze in a set of experiments the impact on language models, i.e. stronger discriminative power but also higher data sparsity. Conditioning N-gram models, MaxEnt models or RNN-LMs on parser predictions yields perplexity reductions of about 6\%. Our method (a) retains online decoding capabilities and (b) incurs relatively little computational overhead which sets it apart from previous approaches that use syntax for language modeling. Our method is particularly attractive for modular systems that make use of a syntax parser anyway, e.g. as part of an understanding pipeline where predictive parsing improves language modeling at no additional cost.

Tasks

Reproductions