Bilevel Hyperparameter Optimization for Nonlinear Support Vector Machines

While the problem of tuning the hyperparameters of a support vector machine (SVM) via
cross-validation is easily understood as a bilevel optimization problem, so far, the corresponding
literature has mainly focused on the linear-kernel case. In this paper, we establish
a theoretical framework for the development of bilevel optimization-based methods for tuning
the hyperparameters of an SVM in the case where a nonlinear kernel is adopted, which
affords the ability to capture highly complex relationships between the points in the data
set. By leveraging a Karush-Kuhn-Tucker (KKT)/mathematical program with equilibrium
constraints (MPEC) reformulation of the (lower-level) training problem, we develop a
theoretical framework for the SVM hyperparameter-tuning problem that establishes under
which assumptions key conditions, including the Mangasarian–Fromovitz and
linear-independence constraint qualifications and the strong second-order sufficient
condition, are satisfied. We then illustrate the need for this framework in the context
of the well-known Scholtes relaxation algorithm for solving the MPEC reformulation of
our bilevel hyperparameter-tuning problem for SVMs. Numerical experiments demonstrate
the potential of this algorithm on examples of nonlinear SVM problems.
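The Scholtes relaxation mentioned above replaces each complementarity constraint 0 ≤ G(z) ⟂ H(z) ≥ 0 with the relaxed system G(z) ≥ 0, H(z) ≥ 0, G(z)·H(z) ≤ t, and solves a sequence of standard nonlinear programs while driving t to zero. The following is a minimal sketch of this scheme on a toy two-variable MPEC, not the paper's SVM problem; the objective, tolerances, and parameter schedule are illustrative assumptions, and SciPy's SLSQP solver stands in for whatever NLP solver the authors use.

```python
import numpy as np
from scipy.optimize import minimize

# Toy MPEC (illustrative, not the paper's SVM problem):
#     min (x - 2)^2 + (y - 1)^2   s.t.   x >= 0, y >= 0, x * y = 0.
# The complementarity constraint x * y = 0 violates standard constraint
# qualifications at any feasible point, so we apply the Scholtes scheme:
# relax it to x * y <= t and drive t -> 0, warm-starting each tighter
# relaxed problem at the previous solution.

def scholtes_solve(t_values, z0=(0.5, 0.5)):
    z = np.asarray(z0, dtype=float)
    for t in t_values:
        cons = [
            {"type": "ineq", "fun": lambda z: z[0]},                  # x >= 0
            {"type": "ineq", "fun": lambda z: z[1]},                  # y >= 0
            {"type": "ineq", "fun": lambda z, t=t: t - z[0] * z[1]},  # x*y <= t
        ]
        res = minimize(lambda z: (z[0] - 2.0) ** 2 + (z[1] - 1.0) ** 2,
                       z, method="SLSQP", constraints=cons)
        z = res.x  # warm start for the next, tighter relaxation
    return z

if __name__ == "__main__":
    sol = scholtes_solve([1.0, 1e-1, 1e-2, 1e-4, 1e-6])
    print(sol)  # approaches the MPEC solution (2, 0)
```

In the paper's setting, the relaxed complementarity constraints come from the KKT system of the lower-level SVM training problem, and the constraint-qualification results are what guarantee the relaxed problems are well behaved as t shrinks.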
