Support Vector Machines: A Geometric Point of View
Speaker: Prof. Sergios Theodoridis
University: Univ. of Athens
Room : A56
Abstract: Support Vector Machines have been established as one of the major classification and regression tools for Pattern Recognition and Signal Analysis. Over the last decade a number of theoretical arguments have been developed in order to justify their enhanced performance. The most widely known scenario is to look at them as maximum margin classifiers. Another approach is via learning theory arguments and the structural risk minimization principle, which leads to an optimal trade off between performance and complexity. An alternative path is to look at the cost function, associated with the SVMs, as a regularized minimizer that asymptotically tends to the Bayesian classifier. A less known viewpoint is the geometric one that leads to the notion of reduced convex hulls. For the non-separable class case, the SVM solution is shown to be equivalent with computing the minimum distance between two reduced versions of the original convex hulls that "encircle" the two classes (for the two class case). In this talk I will focus on the geometric approach and new results will be discussed concerning a) novel, necessary for our case, theorems concerning the structure and properties of the reduced convex hulls (RCH) and b) novel algorithms for computing the minimum distance between the resulting RCH?s. This problem is far from being trivial, since existing algorithms, which compute the minimum distance between convex hulls, rely on their respective extreme points. However, computing the extreme points of a reduced convex hull, as we have shown, is a computationally hard task of a combinatorial nature. A basic projection theorem, that we have shown, will be discussed that bypasses the combinatorial burden of the task and opens the way to employ geometric minimum distance algorithms to the SVM task. Most important, this theorem "respects" inner products, thus allowing to the well known kernel trick to be easily incorporated into the algorithmic schemes, making them appropriate for the general nonlinear non-separable problem. The derived geometric algorithms are much more efficient compared to the classical and widely used SMO algorithm and its versions. A number of tests with well known test beds have shown that, sometimes, a gain of an order of magnitude in the number of kernel computations, for similar error rates, can be achieved. Furthermore, the new schemes are closer to our intuitive understanding of an iterative algorithm in simple geometric arguments. The reported results are the outcome of a joint work with Dr M. Mavroforakis.