Tech Report CS-94-08

Combining Grammars For Improved Learning

Glenn Carroll and Eugene Charniak

February 1994

Abstract:

We report experimental work on improving learning methods for probabilistic context-free grammars (PCFGs). From stacked regression we borrow the basic idea of combining grammars. Smoothing, a domain-independent method for combining grammars, does not offer noticeable performance gains. However, PCFGs allow much tighter, domain-dependent coupling, and we show that this may be exploited for significant performance gains. Finally, we compare two strategies for acquiring the varying grammars needed for any combining method. We suggest that an unorthodox strategy, "leave-one-in" learning, is more effective than the more familiar "leave-one-out".
