SOTAVerified

Efficient Weighted Deduction Systems for Earley’s Algorithm

2022-01-16ACL ARR January 2022Unverified0· sign in to hype

Anonymous

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

The parsing algorithm of Earley (1970), as presented, has a runtime complexity of O(N^3G R) where N is the length of the sentence, G is the size of the grammar, and R is the number of productions in the grammar. This is unworkable for the large grammars that arise in natural language processing. Fortunately, the dynamic programming algorithm can be improved to run in time O(N^3G), matching the complexity of running CKY on a binarized version of G. Some of the necessary speed-ups have been presented in part or in full in various parts of the literature. However, there has been no unified, formal treatment that is written as a deduction system or covers the weighted case. We present such a treatment in terms of five proof rules that can be used in weighted deduction, which refine Earley's , and actions. We also provide a generalization of Earley's algorithm that uses a finite-state automaton to represent the grammar, and whose runtime is proportional to the size of the automaton (and the usual O(N^3) term), or more precisely the size of the portion of the automaton that is reached while parsing the input sentence. Further speed-ups can then be achieved by minimizing the automaton so that similar productions share transitions.

Tasks

Reproductions