Robust Regression over Averaged Uncertainty

We propose a new formulation of robust regression by integrating all realizations of the uncertainty set and taking an averaged approach to obtain the optimal solution for the ordinary least-squared regression problem. We show that this formulation surprisingly recovers ridge regression and establishes the missing link between robust optimization and the mean squared error approaches … Read more

Stochastic Discrete First-order Algorithm for Feature Subset Selection

This paper addresses the problem of selecting a significant subset of candidate features to use for multiple linear regression. Bertsimas et al. (2016) recently proposed the discrete first-order (DFO) algorithm to efficiently find near-optimal solutions to this problem. However, this algorithm is unable to escape from locally optimal solutions. To resolve this, we propose a … Read more

Subset selection in sparse matrices

In subset selection we search for the best linear predictor that involves a small subset of variables. From a computational complexity viewpoint, subset selection is NP-hard and few classes are known to be solvable in polynomial time. Using mainly tools from discrete geometry, we show that some sparsity conditions on the original data matrix allow … Read more

A mixed-integer fractional optimization approach to best subset selection

We consider the best subset selection problem in linear regression, i.e., finding a parsimonious subset of the regression variables that provides the best fit to the data according to some predefined criterion. We show that, for a broad range of criteria used in the statistics literature, the best subset selection problem can be modeled as … Read more

Best subset selection of factors affecting influenza spread using bi-objective optimization

A typical approach for computing an optimal strategy for non-pharmaceutical interventions during an influenza outbreak is based on statistical ANOVA. In this study, for the first time, we propose to use bi-objective mixed integer linear programming. Our approach employs an existing agent-based simulation model and statistical design of experiments presented in Martinez and Das (2014) … Read more

Best subset selection via bi-objective mixed integer linear programming

We study the problem of choosing the best subset of p features in linear regression given n observations. This problem naturally contains two objective functions including minimizing the amount of bias and minimizing the number of predictors. The existing approaches transform the problem into a single-objective optimization problem either by combining the two objectives using … Read more

Best subset selection for eliminating multicollinearity

This paper proposes a method for eliminating multicollinearity from linear regression models. Specifically, we select the best subset of explanatory variables subject to the upper bound on the condition number of the correlation matrix of selected variables. We first develop a cutting plane algorithm that, to approximate the condition number constraint, iteratively appends valid inequalities … Read more

A New Perspective on Boosting in Linear Regression via Subgradient Optimization and Relatives

In this paper we analyze boosting algorithms in linear regression from a new perspective: that of modern first-order methods in convex optimization. We show that classic boosting algorithms in linear regression, namely the incremental forward stagewise algorithm (FS-epsilon) and least squares boosting (LS-Boost-epsilon), can be viewed as subgradient descent to minimize the loss function defined … Read more