August 7, 2003
Noon - 1:30 pm
Intel Seminar (417 S. Craig Street - 3rd Floor)
EVENTS PAGE: http://www.intel-research.net/pittsburgh/events.htm
Information Theoretic Feature Selection for Large-scale Classification
Feature selection remains challenging for large-scale classification
problems that involve large numbers of classes and significant amounts of
training data per class, as in visual recognition, speech recognition,
or information retrieval. In this work, we introduce new connections
between information theoretic (infomax) feature selection methods and
the Bayes classification error to develop a new family of feature selection
algorithms. The concept of marginal diversity is introduced, leading to
a discriminant feature selection principle of extreme computational simplicity.
The relationships between infomax and maximization of marginal diversity
are studied, uncovering a family of classification problems for which
infomax-optimal feature selection does not require combinatorial search.
An analysis of this family in light of recent studies on the statistics
of natural images suggests a generalization of the principle of maximum
marginal diversity that allows explicit control of the trade-off between
complexity and infomax-optimality. Experimental results, in the context
of visual recognition, indicate that the optimal trade-off occurs at low levels
of complexity. The corresponding algorithm is shown to significantly outperform
existing scalable feature selection techniques.
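
To make the selection principle in the abstract concrete, here is a minimal sketch. The abstract does not define marginal diversity, so the definition used here is an assumption: the class-prior-weighted KL divergence between each class-conditional feature density and the marginal feature density, md(X_k) = sum_i P(Y=i) KL(P(X_k|Y=i) || P(X_k)). The histogram estimator and the names marginal_diversity, select_features, and n_bins are illustrative choices, not the speaker's implementation.

```python
import numpy as np


def marginal_diversity(X, y, n_bins=8):
    """Histogram estimate of the marginal diversity of each feature.

    Assumed definition (not stated in the announcement):
        md(X_k) = sum_i P(Y=i) * KL( P(X_k | Y=i) || P(X_k) )
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / counts.sum()

    md = np.zeros(X.shape[1])
    for k in range(X.shape[1]):
        # Shared bin edges keep class-conditional and marginal histograms aligned.
        edges = np.histogram_bin_edges(X[:, k], bins=n_bins)
        marginal, _ = np.histogram(X[:, k], bins=edges)
        marginal = marginal / marginal.sum()
        for c, prior in zip(classes, priors):
            cond, _ = np.histogram(X[y == c, k], bins=edges)
            cond = cond / cond.sum()
            nz = cond > 0  # the marginal is nonzero wherever the conditional is
            md[k] += prior * np.sum(cond[nz] * np.log(cond[nz] / marginal[nz]))
    return md


def select_features(X, y, n_select, n_bins=8):
    """Return indices of the n_select features with the largest marginal diversity."""
    md = marginal_diversity(X, y, n_bins)
    return np.argsort(md)[::-1][:n_select]
```

Each feature is scored independently from one-dimensional histograms and the scores are then sorted, so under these assumptions no search over feature subsets is needed. Calling select_features(X, y, n_select=10) on an (n_samples, n_features) array X with integer labels y would return the ten highest-scoring feature indices.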
Nuno Vasconcelos received a Licenciatura from the University
of Porto, Portugal, and an SM and a PhD from MIT. He was a member of the research
staff at the Compaq Cambridge Research Laboratory, and then HP Cambridge
Research Laboratory, between 2000 and 2003. In March 2003, he joined the
Department of Electrical and Computer Engineering at the University of
California, San Diego, where he is an assistant professor. His interests
include machine vision, machine learning, and statistical signal processing.
Contact Kim Kaan, 412-605-1203,
or visit http://www.intel-research.net.
SDI Home: http://www.pdl.cmu.edu/SDI/