Shattering Inequalities for Learning Optimal Decision Trees

Recently, mixed-integer programming (MIP) techniques have been applied to learn optimal decision trees. Empirical research has shown that optimal trees typically have better out-of-sample performance than heuristic approaches such as CART. However, the underlying MIP formulations often suffer from weak linear programming (LP) relaxations. Many existing MIP approaches employ big-M constraints to ensure observations are … Read more

Optimal Cross-Validation for Sparse Linear Regression

Given a high-dimensional covariate matrix and a response vector, ridge-regularized sparse linear regression selects a subset of features that explains the relationship between covariates and the response in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like k-fold cross-validation are commonly used for hyperparameter tuning. However, cross-validation substantially increases the … Read more

A new upper bound of the Euclidean TSP constant

Let X1, X2, . . . , Xn be n independent and uniformly distributed random points in a compact region R ⊂ R2 of area 1. Let TSP(X1, . . . , Xn) denote the length of the optimal Euclidean traveling salesman tour that traverses all these points. The classical Beardwood-Halton-Hammersley theorem proves the existence … Read more

Nonlinear Distributionally Robust Optimization

This article focuses on a class of distributionally robust optimization (DRO) problems where, unlike the growing body of the literature, the objective function is potentially non-linear in the distribution. Existing methods to optimize nonlinear functions in probability space use the Frechet derivatives, which present both theoretical and computational challenges. Motivated by this, we propose an … Read more

Finding Regions of Counterfactual Explanations via Robust Optimization

Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimal perturbed data point for which the decision of the model changes. Most of the existing methods can only provide one CE, which may not be achievable for the user. In this … Read more

Maximum Likelihood Probability Measures over Sets and Applications to Data-Driven Optimization

Motivated by data-driven approaches to sequential decision-making under uncertainty, we study maximum likelihood estimation of a distribution over a general measurable space when, unlike traditional setups, realizations of the underlying uncertainty are not directly observable but instead are known to lie within observable sets. While extant work studied the special cases when the observed sets … Read more

Optimized Dimensionality Reduction for Moment-based Distributionally Robust Optimization

Moment-based distributionally robust optimization (DRO) provides an optimization framework to integrate statistical information with traditional optimization approaches. Under this framework, one assumes that the underlying joint distribution of random parameters runs in a distributional ambiguity set constructed by moment information and makes decisions against the worst-case distribution within the set. Although most moment-based DRO problems … Read more

When Deep Learning Meets Polyhedral Theory: A Survey

In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks in tasks such as computer vision and natural language processing. Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise constant and piecewise linear functions such as the Rectified … Read more

A Stochastic-Gradient-based Interior-Point Algorithm for Solving Smooth Bound-Constrained Optimization Problems

A stochastic-gradient-based interior-point algorithm for minimizing a continuously differentiable objective function (that may be nonconvex) subject to bound constraints is presented, analyzed, and demonstrated through experimental results. The algorithm is unique from other interior-point methods for solving smooth (nonconvex) optimization problems since the search directions are computed using stochastic gradient estimates. It is also unique … Read more