A Statistical Syntactic Disambiguation Program and What It Learns
Murat Ersan and Eugene Charniak
We describe a program that uses statistical information on word usage
to perform syntactic disambiguation, and show that the use of this
information significantly improves performance. The bulk of the
paper, however, attempts to answer the question: what did the program
learn that would account for this improvement? We show that the
program has learned many linguistically recognized forms of lexical
information, particularly verb case frames and prepositional
preferences for nouns and adjectives. We also show that, viewed simply
as a learner of lexical information, the program is a success,
performing slightly better than hand-crafted learning programs for the
same tasks.