The linear support vector machine can be posed as a quadratic program in a variety of ways. In this paper, we look at a formulation using the two-norm for the misclassification error that leads to a positive definite quadratic program with a single equality constraint when the Wolfe dual is taken. The quadratic term is a small rank update to a positive definite matrix. We reformulate the optimality conditions as a semismooth system of equations using the Fischer-Burmeister function and apply a damped Newton method to solve the resulting problem. The algorithm is shown to converge from any starting point with a Q-quadratic rate of convergence. At each iteration, we use the Sherman-Morrison-Woodbury update formula to solve a single linear system of equations. Significant computational savings are realized as the inactive variables are identified and exploited during the solution process. Results for a 60 million variable problem are presented, demonstrating the effectiveness of the proposed method on a personal computer.
Data Mining Institute Technical Report 00-09, Computer Sciences Department, University of Wisconsin, Madison, 2000