**Abstract:** |
We study the problem of finding the optimal hyperplane for inducing a binary partition of a multivariate dataset. In this talk we consider two approaches to this problem. The first is motivated by 'high density clustering', in which different classes are assumed to arise uniquely from the modes of a probability density function. The optimal hyperplane is that which has the minimum density along it. The second is motivated by normalised graph partitioning. This is a combinatorial problem for which no known efficient algorithms exist. The continuous relaxation of the problem can, however, be solved by spectral clustering. The optimal hyperplane in this case minimises the 'algebraic' or 'spectral' connectedness of the data projected onto the unit vector normal to the hyperplane. While different in motivation, these two approaches are connected in that asymptotically, as their smoothing parameters tend to zero, they both converge to the largest margin hyperplane through the data. To solve these problems, we formulate them in the context of 'projection pursuit', in which a projection index is optimised over the space of projection vectors. In our case the projection indices for a projection, v, are the minimum density admitted by a hyperplane orthogonal to v, and the second eigenvalue of the graph Laplacian of the data projected onto v. Both projection indices are Lipschitz continuous, and continuously differentiable for almost all v for any continuous sampling distribution. This allows us to use modern techniques for the optimisation of non-smooth functions in order to find locally optimal solutions. |