Inexact Direct-Search Methods for Bilevel Optimization Problems

In this work, we introduce new direct search schemes for the solution of bilevel optimization (BO) problems. Our methods rely on a fixed accuracy black box oracle for the lower-level problem, and deal both with smooth and potentially nonsmooth true objectives. We thus analyze for the first time in the literature direct search schemes in … Read more

Structured Pruning of Neural Networks for Constraints Learning

In recent years, the integration of Machine Learning (ML) models with Operation Research (OR) tools has gained popularity across diverse applications, including cancer treatment, algorithmic configuration, and chemical process optimization. In this domain, the combination of ML and OR often relies on representing the ML model output using Mixed Integer Programming (MIP) formulations. Numerous studies … Read more

Sharpness and well-conditioning of nonsmooth convex formulations in statistical signal recovery

\(\) We study a sample complexity vs. conditioning tradeoff in modern signal recovery problems where convex optimization problems are built from sampled observations. We begin by introducing a set of condition numbers related to sharpness in \(\ell_p\) or Schatten-p norms (\(p\in[1,2]\)) based on nonsmooth reformulations of a class of convex optimization problems, including sparse recovery, … Read more

Shattering Inequalities for Learning Optimal Decision Trees

Recently, mixed-integer programming (MIP) techniques have been applied to learn optimal decision trees. Empirical research has shown that optimal trees typically have better out-of-sample performance than heuristic approaches such as CART. However, the underlying MIP formulations often suffer from weak linear programming (LP) relaxations. Many existing MIP approaches employ big-M constraints to ensure observations are … Read more

Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression

Ridge regularized sparse linear regression involves selecting a subset of features that explains the relationship between a high-dimensional design matrix and an output vector in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like leave-one-out cross-validation are commonly used for hyperparameter tuning. However, cross-validation typically increases the cost of sparse … Read more

A new upper bound of the Euclidean TSP constant

Let X1, X2, . . . , Xn be n independent and uniformly distributed random points in a compact region R ⊂ R2 of area 1. Let TSP(X1, . . . , Xn) denote the length of the optimal Euclidean traveling salesman tour that traverses all these points. The classical Beardwood-Halton-Hammersley theorem proves the existence … Read more

Nonlinear Distributionally Robust Optimization

This article focuses on a class of distributionally robust optimization (DRO) problems where, unlike the growing body of the literature, the objective function is potentially non-linear in the distribution. Existing methods to optimize nonlinear functions in probability space use the Frechet derivatives, which present both theoretical and computational challenges. Motivated by this, we propose an … Read more

Finding Regions of Counterfactual Explanations via Robust Optimization

Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimal perturbed data point for which the decision of the model changes. Most of the existing methods can only provide one CE, which may not be achievable for the user. In this … Read more

Maximum Likelihood Probability Measures over Sets and Applications to Data-Driven Optimization

\(\) Motivated by data-driven approaches to sequential decision-making under uncertainty, we study maximum likelihood estimation of a distribution over a general measurable space when, unlike traditional setups, realizations of the underlying uncertainty are not directly observable but instead are known to lie within observable sets. While extant work studied the special cases when the observed … Read more