Best subset selection via bi-objective mixed integer linear programming

We study the problem of choosing the best subset of p features in linear regression given n observations. This problem naturally involves two objectives: minimizing the amount of bias and minimizing the number of predictors. The existing approaches transform the problem into a single-objective optimization problem either by combining the two objectives using … Read more

Structural Properties of Affine Sparsity Constraints

We introduce a new constraint system for sparse variable selection in statistical learning. Such a system arises when there are logical conditions on the sparsity of certain unknown model parameters that need to be incorporated into their selection process. Formally, extending a cardinality constraint, an affine sparsity constraint (ASC) is defined by a linear inequality … Read more

On Procrustes matching of non-negative matrices and an application to random tomography

We consider a Procrustes matching problem for non-negative matrices that arose in random tomography. As an alternative to the Frobenius distance, we propose a non-symmetric distance using generalized inverses. Among its advantages is that it leads to a relatively simple quadratic function that can be optimized with least-squares methods on manifolds. Citation: Accepted for publication … Read more

Learning Enabled Optimization: Towards a Fusion of Statistical Learning and Stochastic Optimization

Several emerging applications, such as “Analytics of Things” and “Integrative Analytics” call for a fusion of statistical learning (SL) and stochastic optimization (SO). The Learning Enabled Optimization paradigm fuses concepts from these disciplines in a manner which not only enriches both SL and SO, but also provides a framework which supports rapid model updates and … Read more

D-Optimal Design for Multivariate Polynomial Regression via the Christoffel Function and Semidefinite Relaxations

We present a new approach to the design of D-optimal experiments with multivariate polynomial regressions on compact semi-algebraic design spaces. We apply the moment-sum-of-squares hierarchy of semidefinite programming problems to solve the optimal design problem numerically and approximately. The geometry of the design is recovered with semidefinite programming duality theory and the Christoffel polynomial. … Read more

General parameterized proximal point algorithm with applications in the statistical learning

In the literature, there are a few studies of the proximal point algorithm (PPA) with parameters in the proximal matrix, especially for multi-objective optimization problems. Introducing parameters into the PPA makes it more attractive and flexible. By using the unified framework of the classical PPA and constructing a parameterized proximal matrix, … Read more

Faster Estimation of High-Dimensional Vine Copulas with Automatic Differentiation

The vine copula is an important tool in modeling dependence structures of continuous-valued random variables. The maximum likelihood estimation (MLE) for vine copulas has long been considered computationally difficult in higher dimensions, even in 10 or 20 dimensions. Current computational practice, including the implementation in the state-of-the-art R package VineCopula, suffers from the bottleneck of … Read more

A Parameterized Proximal Point Algorithm for Separable Convex Optimization

In this paper, we develop a Parameterized Proximal Point Algorithm (P-PPA) for solving a class of separable convex programming problems subject to linear and convex constraints. The proposed algorithm is provably globally convergent with a worst-case $O(1/t)$ convergence rate, where $t$ is the iteration number. By properly choosing the algorithm parameters, numerical experiments … Read more

Fast approximate solution of large dense linear programs

We show how random projections can be used to solve large-scale dense linear programs approximately. This is a new application of techniques which are now fairly well known in probabilistic algorithms, but have not yet been systematically applied to the fundamental class of Linear Programming. We develop the necessary theoretical framework, and show that this … Read more

Mixed Integer Quadratic Optimization Formulations for Eliminating Multicollinearity Based on Variance Inflation Factor

The variance inflation factor, VIF, is the most frequently used indicator for detecting multicollinearity in multiple linear regression models. This paper proposes two mixed integer quadratic optimization formulations for selecting the best subset of explanatory variables under upper-bound constraints on VIF of selected variables. Computational results illustrate the effectiveness of our optimization formulations based on … Read more
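
For readers unfamiliar with the indicator named in the abstract above, the following is a minimal sketch of how a VIF is commonly computed: the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors with an intercept. The function names `solve` and `vif` and the toy data are illustrative assumptions; this is not the paper's mixed integer quadratic formulation, only the quantity its constraints bound.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def vif(columns, j):
    """Hypothetical helper: VIF of predictor j, given predictors as a list of columns.

    Assumes the predictors are not perfectly collinear (otherwise R_j^2 = 1
    and the VIF is unbounded).
    """
    y = columns[j]
    n = len(y)
    # Design matrix stored column-wise: intercept plus all other predictors.
    X = [[1.0] * n] + [col for k, col in enumerate(columns) if k != j]
    # Normal equations (X^T X) beta = X^T y.
    gram = [[sum(a * b for a, b in zip(u, v)) for v in X] for u in X]
    rhs = [sum(a * b for a, b in zip(u, y)) for u in X]
    beta = solve(gram, rhs)
    fitted = [sum(beta[k] * X[k][i] for k in range(len(X))) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return 1.0 / (1.0 - r_squared)


# Toy data (made up for illustration): two nearly collinear predictors.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.0, 3.0, 5.0]
high_vif = vif([x1, x2], 0)
```

A subset of explanatory variables would satisfy an upper-bound constraint of the kind the abstract describes when `vif(subset, j)` stays below the chosen bound for every `j` in the subset.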