A Maximum-Entropy-Inspired Parser
We present a new parser that parses sentences down to Penn tree-bank
style parse trees, achieving 90.1% average precision/recall for
sentences of length 40 or less, and 89.5% for sentences of length 100
or less,
when trained and tested on the previously established ``standard''
sections of the Wall Street Journal tree-bank. This represents a 15%
decrease in error rate over the best single-parser results on this
corpus. The major technical innovation in this parser is the use of a
``maximum-entropy-inspired'' model for conditioning and smoothing that
allowed us to successfully test and combine many different
conditioning events. We also present some partial results showing the
effects of different conditioning information, including a surprising
2% improvement due to guessing the lexical head's pre-terminal before
guessing the lexical head.
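The pre-terminal trick can be read as a factoring of the head
distribution: instead of predicting the head word directly from the
context, first predict its pre-terminal (part-of-speech tag), then the
word given the tag. A minimal sketch of this factorization follows;
the function names, toy counts, and maximum-likelihood estimates are
illustrative assumptions, not the paper's actual smoothed model:

```python
from collections import defaultdict

def mle(counts):
    """Maximum-likelihood estimate over a dict of counts."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

# Toy observations of (context, pre-terminal, head word).
# In a real parser these statistics come from the tree-bank.
data = [
    ("VP", "VBD", "bought"),
    ("VP", "VBD", "sold"),
    ("VP", "VBZ", "buys"),
    ("NP", "NN", "stock"),
]

tag_given_ctx = defaultdict(lambda: defaultdict(int))
head_given_tag_ctx = defaultdict(lambda: defaultdict(int))
for ctx, tag, head in data:
    tag_given_ctx[ctx][tag] += 1
    head_given_tag_ctx[(ctx, tag)][head] += 1

def p_head(ctx, tag, head):
    # p(tag, head | ctx) = p(tag | ctx) * p(head | tag, ctx):
    # each factor is estimated from denser counts than the joint,
    # which is why the factored model smooths better.
    p_tag = mle(tag_given_ctx[ctx]).get(tag, 0.0)
    p_word = mle(head_given_tag_ctx[(ctx, tag)]).get(head, 0.0)
    return p_tag * p_word

print(p_head("VP", "VBD", "bought"))  # 2/3 * 1/2 = 1/3
```

Conditioning the head word on its tag lets the model fall back on
tag-level statistics when the word itself is rare, which is the kind
of effect the reported 2% improvement points to.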