Almost-sure convergence of iterates and multipliers in stochastic sequential quadratic optimization

Stochastic sequential quadratic optimization (SQP) methods for solving continuous optimization problems with nonlinear equality constraints have attracted attention recently, such as for solving large-scale data-fitting problems subject to nonconvex constraints. However, for a recently proposed subclass of such methods that is built on the popular stochastic-gradient methodology from the unconstrained setting, convergence guarantees have been … Read more

Solution Path of Time-varying Markov Random Fields with Discrete Regularization

\(\) We study the problem of inferring sparse time-varying Markov random fields (MRFs) with different discrete and temporal regularizations on the parameters. Due to the intractability of discrete regularization, most approaches for solving this problem rely on the so-called maximum-likelihood estimation (MLE) with relaxed regularization, which neither results in ideal statistical properties nor scale to … Read more

Inexact Direct-Search Methods for Bilevel Optimization Problems

In this work, we introduce new direct search schemes for the solution of bilevel optimization (BO) problems. Our methods rely on a fixed accuracy black box oracle for the lower-level problem, and deal both with smooth and potentially nonsmooth true objectives. We thus analyze for the first time in the literature direct search schemes in … Read more

Structured Pruning of Neural Networks for Constraints Learning

In recent years, the integration of Machine Learning (ML) models with Operation Research (OR) tools has gained popularity across diverse applications, including cancer treatment, algorithmic configuration, and chemical process optimization. In this domain, the combination of ML and OR often relies on representing the ML model output using Mixed Integer Programming (MIP) formulations. Numerous studies … Read more

Sharpness and well-conditioning of nonsmooth convex formulations in statistical signal recovery

\(\) We study a sample complexity vs. conditioning tradeoff in modern signal recovery problems where convex optimization problems are built from sampled observations. We begin by introducing a set of condition numbers related to sharpness in \(\ell_p\) or Schatten-p norms (\(p\in[1,2]\)) based on nonsmooth reformulations of a class of convex optimization problems, including sparse recovery, … Read more

Shattering Inequalities for Learning Optimal Decision Trees

Recently, mixed-integer programming (MIP) techniques have been applied to learn optimal decision trees. Empirical research has shown that optimal trees typically have better out-of-sample performance than heuristic approaches such as CART. However, the underlying MIP formulations often suffer from weak linear programming (LP) relaxations. Many existing MIP approaches employ big-M constraints to ensure observations are … Read more

Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression

Ridge regularized sparse linear regression involves selecting a subset of features that explains the relationship between a high-dimensional design matrix and an output vector in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like leave-one-out cross-validation are commonly used for hyperparameter tuning. However, cross-validation typically increases the cost of sparse … Read more

A new upper bound of the Euclidean TSP constant

Let X1, X2, . . . , Xn be n independent and uniformly distributed random points in a compact region R ⊂ R2 of area 1. Let TSP(X1, . . . , Xn) denote the length of the optimal Euclidean traveling salesman tour that traverses all these points. The classical Beardwood-Halton-Hammersley theorem proves the existence … Read more

Nonlinear Distributionally Robust Optimization

This article focuses on a class of distributionally robust optimization (DRO) problems where, unlike the growing body of the literature, the objective function is potentially non-linear in the distribution. Existing methods to optimize nonlinear functions in probability space use the Frechet derivatives, which present both theoretical and computational challenges. Motivated by this, we propose an … Read more