Tech Report CS-06-06

A Cutting-Plane Algorithm for Learning from Ambiguous Examples

Stuart Andrews and Thomas Hofmann

April 2006

Abstract:

We present a novel algorithm developed within the risk minimization framework for learning from ambiguous and unlabeled examples as encountered in multi-instance, transductive and semi-supervised learning. For these problems, risk minimization involves optimization over a non-convex feasible region defined by the ambiguous examples. Our method generalizes 1-norm SVM. We adapts lift-and-project cutting-planes from integer-programming to approximate the convex hull of the feasible region to arbitrary precision. The resulting classifier is competitive on several benchmarks, and in particular, outperforms existing algorithms on text categorization tasks due to its feature selectivity. This algorithm demonstrates that risk minimization can be used to effectively disambiguate and learn from ambiguous examples.

(complete text in pdf)