Equations for Part-of-Speech Tagging
Eugene Charniak, Curtis Hendrickson,
Neil Jacobson,
and Mike Perkowitz
We derive from first principles the basic equations for a few of the
standard hidden-Markov-model word taggers, as well as equations for other
models which may be novel (the descriptions in previous papers being
too spare to be sure). We give performance results for all of the
models. The result from our best model (96.45% on an unused test
sample from the Brown corpus with 181 distinct tags) is on the upper
edge of reported results. We also hope these results clear up some
confusion in the literature about the best equations to use. However,
the major purpose of this paper is to show how the equations for a
variety of models may be derived and thus encourage future authors to
give the equations for their model and the derivations thereof.