Katarina Domijan, Trinity College Dublin

Title: "VARIABLE SELECTION VIA DECISION THEORY FOR A BAYESIAN KERNEL CLASSIFICATION METHOD"

with Simon P. Wilson.

Abstract:

We consider the problem of optimal Bayesian design for a variable
selection problem in a multi-category classification model based
on the theory of reproducing kernel Hilbert spaces (RKHS). The
kernel classifier has a hierarchical prior structure and is
treated with a fully Bayesian inference procedure. The Gibbs
sampler is implemented to find the posterior distributions of the
parameters. A useful by-product of this approach is that it
yields samples from the posterior distributions of the class
probabilities, which give a more complete picture of the
classification. Kernel classifiers are typically aimed at $n \ll p$
problems, and we apply the approach to some benchmark data sets as
well as a high-dimensional microarray data set. Another attractive
feature of the model is that it achieves good classification
results without any pre-processing or data reduction. However, the
kernel classifier is high-dimensional and over-parameterized,
so we use Bayesian decision theory to reduce its
complexity. The decision space is the set of all possible subsets
of variables, which is discrete and high-dimensional. This
translates to a stochastic optimization problem, where the optimal
decision maximizes the expected utility function with respect to
all the unknowns, including the model parameters and future data.
We consider some stochastic optimization methods, in particular
Tabu Search (\cite{glover77}) and Inhomogeneous MCMC
(\cite{muller04}), a method closely related to simulated annealing
whose limiting distribution concentrates on the optimal solution.
Inhomogeneous MCMC is aimed at design problems, such as our
application, where the utility function is not available for
direct evaluation and must instead be computed through simulation.
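
As a rough illustration of the kind of search described above, the
following is a minimal Python sketch of an annealing-style Metropolis
search over variable subsets in which the utility is only available
through simulation. It is not the authors' implementation. The toy
utility, the set RELEVANT, the schedule J = 1 + t // 50 and all
function names are assumptions made for this example. The chain
compares products of J simulated utility draws, so that as J grows it
concentrates on subsets with high expected utility.

```python
import math
import random

# Hypothetical "useful" variables; a toy assumption for this sketch only.
RELEVANT = frozenset({0, 1, 2})


def simulate_log_utility(subset, rng):
    """One simulated draw of log u(d, theta, y) for a toy positive utility.

    The utility decays with the number of variables on which `subset`
    disagrees with RELEVANT; the uniform noise stands in for the
    randomness in the parameters and future data.
    """
    mismatches = len(subset ^ RELEVANT)
    noise = rng.uniform(0.5, 1.5)
    return -0.5 * mismatches + math.log(noise)


def search_subsets(p, n_iter, seed=0):
    """Annealing-style search over subsets of {0, ..., p-1}."""
    rng = random.Random(seed)
    current = frozenset()
    cur_draws = [simulate_log_utility(current, rng)]
    for t in range(n_iter):
        # Assumed schedule: the number of utility draws J grows slowly.
        J = 1 + t // 50
        # Top up the current state's draws when J has increased.
        while len(cur_draws) < J:
            cur_draws.append(simulate_log_utility(current, rng))
        # Symmetric proposal: flip the inclusion of one random variable.
        k = rng.randrange(p)
        proposal = current ^ {k}
        prop_draws = [simulate_log_utility(proposal, rng) for _ in range(J)]
        # Metropolis ratio: product of J simulated utilities, on the log scale.
        log_ratio = sum(prop_draws) - sum(cur_draws)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, cur_draws = proposal, prop_draws
    return current


if __name__ == "__main__":
    best = search_subsets(p=8, n_iter=2000)
    print(sorted(best))
```

Working with sums of log-utilities rather than products avoids
numerical underflow when J is large. In a real application the toy
draw would be replaced by a simulation from the fitted classifier.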




P. M\"uller, B. Sans\'o and M. De Iorio.
Optimal Bayesian Design by Inhomogeneous Markov Chain Simulation.
{\em J. Am. Stat. Assoc.}, 99(467):788--798, 2004.

F. Glover.
Heuristics for Integer Programming Using Surrogate Constraints.
{\em Decision Sciences}, 8:156--166, 1977.