Section: New Results
Introducing beam search techniques in the Earley algorithm
Participants: Pierre Boullier, Benoît Sagot.
In the context of the Sequoia project, Pierre Boullier and Benoît Sagot have been working on various techniques for reducing the search space of the Earley CFG parsing algorithm when it is used with Probabilistic CFGs (PCFGs). These techniques have been implemented in the Syntax system, but they have been neither fully evaluated nor published yet (both are planned for 2010).
In short, beam search techniques or variants thereof can be introduced at different stages of the Earley algorithm. In particular, given an Earley item, estimates and/or exact values can be computed for the best probability of the prefix and of the suffix of the item, as well as for the parts to the left and to the right of the dot within the item. This allows for various types of online item pruning. Some of them are exact (i.e., the overall best parse is always retained), while others are not. This work is crucial when dealing with very large, highly ambiguous grammars, such as those generated by the Berkeley split-merge algorithm.
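The two kinds of pruning described above can be illustrated with a minimal sketch. It is not the Syntax implementation: the `Item` class, its field names, and both functions are hypothetical, and scores are assumed to be log-probabilities. The exact variant is only safe under the further assumption that every estimate is an upper bound on the true best log-probability and that the log-probability of some complete parse is already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical Earley item carrying four log-probability figures."""
    rule: str            # dotted rule, e.g. "NP -> Det . N"
    prefix_logp: float   # best log-prob (or estimate) of the input before the item
    left_logp: float     # best log-prob of the symbols to the left of the dot
    right_logp: float    # best log-prob (or estimate) of the symbols right of the dot
    suffix_logp: float   # best log-prob (or estimate) of the input after the item

    def score(self) -> float:
        # Log-probabilities add where probabilities multiply.
        return (self.prefix_logp + self.left_logp
                + self.right_logp + self.suffix_logp)


def beam_prune(items, beam_width):
    """Inexact pruning: keep items within beam_width of the best score.

    The best overall parse may be lost if one of its items scores poorly
    in this item set.
    """
    if not items:
        return []
    best = max(item.score() for item in items)
    return [item for item in items if item.score() >= best - beam_width]


def exact_prune(items, known_parse_logp):
    """Exact pruning, assuming each score is an upper bound.

    An item whose upper bound is below the log-probability of an already
    known complete parse cannot belong to the best parse, so dropping it
    never discards the best parse.
    """
    return [item for item in items if item.score() >= known_parse_logp]


if __name__ == "__main__":
    item_set = [
        Item("NP -> Det . N", -1.0, -2.0, -0.5, -0.5),
        Item("NP -> Det . N PP", -1.0, -3.0, -1.0, -1.0),
        Item("NP -> Det . Adj N", -2.0, -4.0, -2.0, -2.0),
    ]
    for kept in beam_prune(item_set, beam_width=3.0):
        print(kept.rule, kept.score())
```

A wider beam keeps more items and lowers the risk of losing the best parse, at the cost of a larger search space. The exact variant has no such parameter, but how much it prunes depends on how tight the upper bounds are.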