Bilevel Hyperparameter Optimization for Nonlinear Support Vector Machines

While the problem of tuning the hyperparameters of a support vector machine (SVM) via cross-validation is easily understood as a bilevel optimization problem, the corresponding literature has so far focused mainly on the linear-kernel case. In this paper, we establish a theoretical framework for the development of bilevel optimization-based methods for tuning the hyperparameters of … Read more
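
As a generic illustration of this viewpoint (the notation below is assumed for exposition and is not taken from the paper), K-fold cross-validation tuning can be written as a bilevel program in which the upper level selects the hyperparameters and each lower-level problem trains an SVM on one fold:

\[
\min_{C,\,\gamma>0}\; \sum_{k=1}^{K}\sum_{(x_i,y_i)\in V_k} \ell\big(y_i,\, f_k(x_i;w_k^\star)\big)
\quad\text{s.t.}\quad
w_k^\star \in \arg\min_{w}\; \tfrac12\|w\|^2 + C\sum_{(x_j,y_j)\in T_k} \max\{0,\,1-y_j f_k(x_j;w)\},
\]

where \(T_k\) and \(V_k\) are the training and validation folds and the kernel parameter \(\gamma\) enters through the feature map defining \(f_k\).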

Information Complexity of Mixed-integer Convex Optimization

We investigate the information complexity of mixed-integer convex optimization under different types of oracles. We establish new lower bounds for the standard first-order oracle, improving upon the previous best-known lower bound. This leaves only a lower-order linear term (in the dimension) as the gap between the lower and upper bounds. This is derived … Read more

Transformation of Bilevel Optimization Problems into Single-Level Ones

Bilevel optimization problems are hierarchical problems with a constraint set which is a subset of the graph of the solution set mapping of a second optimization problem. To investigate their properties and derive solution algorithms, their transformation into single-level ones is necessary. For this, various approaches have been developed. The first and most often used … Read more
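
For concreteness, one classical transformation of this kind (whether or not it is the approach the abstract goes on to name) replaces a convex, suitably regular lower-level problem \(\min_y \{f(x,y) : g(x,y)\le 0\}\) by its KKT conditions, yielding the single-level program

\[
\min_{x,\,y,\,\lambda}\; F(x,y)
\quad\text{s.t.}\quad
\nabla_y f(x,y) + \nabla_y g(x,y)^\top \lambda = 0,\;\;
g(x,y)\le 0,\;\;
\lambda \ge 0,\;\;
\lambda^\top g(x,y) = 0,
\]

which is a mathematical program with complementarity constraints and therefore requires tailored constraint qualifications.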

Duality of upper bounds in stochastic dynamic programming

For multistage stochastic programming problems with stagewise independent uncertainty, dynamic programming algorithms calculate polyhedral approximations for the value functions at each stage.  The SDDP algorithm provides piecewise linear lower bounds, in the spirit of the L-shaped algorithm, and corresponding upper bounds took a longer time to appear.  One strategy uses the primal dynamic programming recursion … Read more
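
The lower bounds mentioned here are the familiar cutting-plane approximations of the expected cost-to-go functions; in generic notation (assumed for illustration),

\[
\underline{\mathcal{Q}}_t(x_{t-1}) \;=\; \max_{k=1,\dots,K}\big\{\alpha_t^k + \langle \beta_t^k,\, x_{t-1}\rangle\big\} \;\le\; \mathcal{Q}_t(x_{t-1}),
\]

so each stage's value function is bounded from below by a polyhedral function built from cuts, whereas valid upper bounds require a separate construction such as the primal recursion referred to in the abstract.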

Exact convergence rate of the last iterate in subgradient methods

We study the convergence of the last iterate in subgradient methods applied to the minimization of a nonsmooth convex function with bounded subgradients. We first introduce a proof technique that generalizes the standard analysis of subgradient methods. It is based on tracking the distance between the current iterate and a different reference point at each … Read more
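
For context, a minimal sketch of the (unconstrained) subgradient method is given below; the paper's question concerns the function value at the final point x_T itself rather than at the best or averaged iterate. The 1/sqrt(t) step size used here is a classical choice assumed purely for illustration.

    import numpy as np

    def subgradient_method(f, subgrad, x0, R=1.0, T=500):
        """Subgradient method for a nonsmooth convex f with bounded subgradients.

        Returns both the last iterate and the best iterate seen, to contrast
        the two quantities the abstract distinguishes.
        """
        x = np.array(x0, dtype=float)
        best_x, best_f = x.copy(), f(x)
        for t in range(1, T + 1):
            g = subgrad(x)
            x = x - (R / np.sqrt(t)) * g       # diminishing step size
            if f(x) < best_f:                  # keep track of the best iterate
                best_x, best_f = x.copy(), f(x)
        return x, best_x                       # last iterate vs. best iterate

    # Toy example: f(x) = ||x||_1, for which sign(x) is a valid subgradient.
    last, best = subgradient_method(lambda x: np.abs(x).sum(), np.sign, [3.0, -2.0])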

Inexact Direct-Search Methods for Bilevel Optimization Problems

In this work, we introduce new direct search schemes for the solution of bilevel optimization (BO) problems. Our methods rely on a fixed-accuracy black-box oracle for the lower-level problem, and deal with both smooth and potentially nonsmooth true objectives. We thus analyze, for the first time in the literature, direct search schemes in … Read more
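
To make the setting concrete, the sketch below shows a plain coordinate-direction search on the upper-level variables in which the objective can only be evaluated through a lower-level solver of fixed accuracy; it illustrates the kind of oracle assumed in the abstract, not the authors' actual schemes.

    import numpy as np

    def direct_search(F_inexact, x0, step=1.0, tol=1e-6, max_iter=200):
        """Coordinate direct search driven by inexact function values.

        F_inexact(x) is assumed to return the upper-level objective evaluated
        at an approximate lower-level solution of fixed accuracy.
        """
        x = np.array(x0, dtype=float)
        fx = F_inexact(x)
        directions = np.vstack([np.eye(x.size), -np.eye(x.size)])
        while step > tol and max_iter > 0:
            max_iter -= 1
            improved = False
            for d in directions:                # poll +/- coordinate directions
                trial = x + step * d
                f_trial = F_inexact(trial)
                if f_trial < fx - step ** 2:    # sufficient decrease test
                    x, fx, improved = trial, f_trial, True
                    break
            if not improved:
                step *= 0.5                     # shrink the step size
        return x, fx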

On the Computation of Restricted Normal Cones

Restricted normal cones are of interest, for instance, in the theory of local error bounds, where they have recently been used to characterize the existence of a constrained Lipschitzian error bound. In this paper, we establish relations between two concepts for restricted normals. The first of these concepts was introduced in the late … Read more

Inverse Optimization for Routing Problems

We propose a method for learning decision-makers’ behavior in routing problems using Inverse Optimization (IO). The IO framework falls into the supervised learning category and builds on the premise that the target behavior is an optimizer of an unknown cost function. This cost function is to be learned through historical data, and in the context … Read more
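
In generic IO notation (assumed here for illustration; the routing-specific model in the paper may differ), one observes pairs \((s_i,\hat{x}_i)\) of problem instances and chosen routes and seeks a cost vector \(\theta\) for which each observed decision is (near-)optimal,

\[
\hat{x}_i \in \arg\min_{x \in X(s_i)} \langle \theta, x\rangle \quad \text{for all } i,
\]

which in practice is relaxed to minimizing a loss such as the total suboptimality \(\sum_i \big(\langle \theta,\hat{x}_i\rangle - \min_{x\in X(s_i)}\langle\theta,x\rangle\big)\) over the historical data.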

Sharpness and well-conditioning of nonsmooth convex formulations in statistical signal recovery

We study a sample complexity vs. conditioning tradeoff in modern signal recovery problems where convex optimization problems are built from sampled observations. We begin by introducing a set of condition numbers related to sharpness in \(\ell_p\) or Schatten-\(p\) norms (\(p\in[1,2]\)) based on nonsmooth reformulations of a class of convex optimization problems, including sparse recovery, low-rank … Read more
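
For readers unfamiliar with the term, sharpness of a convex objective \(f\) with solution set \(X^\star\) is commonly stated (in a form assumed here, which may differ in detail from the paper's definitions) as

\[
f(x) - \min f \;\ge\; \gamma\, \operatorname{dist}(x, X^\star)
\]

for all \(x\) in a neighborhood of \(X^\star\); the ratio of a Lipschitz-type constant of \(f\) to the sharpness constant \(\gamma\) then plays the role of a condition number.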

Provably Faster Gradient Descent via Long Steps

This work establishes provably faster convergence rates for gradient descent in smooth convex optimization via a computer-assisted analysis technique. Our theory allows nonconstant stepsize policies with frequent long steps that may violate descent, by analyzing the overall effect of many iterations at once rather than relying on the typical one-iteration inductions used in most first-order method analyses. We … Read more
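
A minimal sketch of the idea of a repeating stepsize schedule containing occasional long steps is given below; the particular pattern is hypothetical and is not the certified schedule from the paper.

    import numpy as np

    def gradient_descent_long_steps(grad, x0, L, pattern=(1.0, 1.0, 1.0, 3.0),
                                    n_cycles=50):
        """Gradient descent with a repeating stepsize pattern, in units of 1/L.

        The occasional long step may increase the objective on its own
        iteration; the point of the analysis is to bound the combined
        effect of an entire cycle rather than each step separately.
        """
        x = np.array(x0, dtype=float)
        for _ in range(n_cycles):
            for h in pattern:          # hypothetical pattern, not the certified one
                x = x - (h / L) * grad(x)
        return x

    # Toy example: smooth convex quadratic f(x) = 0.5 * x^T A x with L = 10.
    A = np.diag([1.0, 10.0])
    x_final = gradient_descent_long_steps(lambda x: A @ x, [5.0, 5.0], L=10.0)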