Alternating direction method of multipliers for sparse zero-variance discriminant analysis and principal component analysis

We consider the task of classification in the high-dimensional setting where the number of features of the given data is significantly greater than the number of observations. To accomplish this task, we propose sparse zero-variance discriminant analysis (SZVD) as a method for simultaneously performing linear discriminant analysis and feature selection on high-dimensional data. This method combines …

Subset Selection by Mallows’ Cp: A Mixed Integer Programming Approach

This paper concerns a method of selecting the best subset of explanatory variables for a linear regression model. Employing Mallows’ C_p as a goodness-of-fit measure, we formulate the subset selection problem as a mixed integer quadratic programming problem. Computational results demonstrate that our method provides the best subset of variables in a few seconds when …

Generalized Gauss Inequalities via Semidefinite Programming

A sharp upper bound on the probability of a random vector falling outside a polytope, based solely on the first and second moments of its distribution, can be computed efficiently using semidefinite programming. However, this Chebyshev-type bound tends to be overly conservative since it is determined by a discrete worst-case distribution. In this paper we …

A First-Order Algorithm for the A-Optimal Experimental Design Problem: A Mathematical Programming Approach

We develop and analyse a first-order algorithm for the A-optimal experimental design problem. The problem is first presented as a special case of a parametric family of optimal design problems for which duality results and optimality conditions are given. Then, two first-order (Frank-Wolfe type) algorithms are presented, accompanied by a detailed time-complexity analysis of the …

The Direct Extension of ADMM for Multi-block Convex Minimization Problems is Not Necessarily Convergent

The alternating direction method of multipliers (ADMM) is now widely used in many fields, and its convergence was proved when two blocks of variables are alternately updated. It is strongly desirable and practically valuable to extend ADMM directly to the case of a multi-block convex minimization problem where its objective function is the sum of …

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art results for various key machine learning optimization problems including SVM, logistic regression, ridge regression, Lasso, and multiclass SVM. …

Composite Self-concordant Minimization

We propose a variable metric framework for minimizing the sum of a self-concordant function and a possibly non-smooth convex function endowed with a computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz gradient assumption on the smooth part. An important highlight of our work is a new …

Sample Average Approximation Method for Compound Stochastic Optimization Problems

The paper studies stochastic optimization (programming) problems with compound functions containing expectations and extreme values of other random functions as arguments. Compound functions arise in various applications. A typical example is a variance function of nonlinear outcomes. Other examples include stochastic minimax problems, econometric models with latent variables, and multilevel and multicriteria stochastic optimization problems. …

Mixed Integer Second-Order Cone Programming Formulations for Variable Selection

This paper concerns the method of selecting the best subset of explanatory variables in a multiple linear regression model. To evaluate a subset regression model, some goodness-of-fit measures, e.g., adjusted R^2, AIC and BIC, are generally employed. Although variable selection is usually handled via a stepwise regression method, the method does not always provide the …

One condition for all: solution uniqueness and robustness of l1-synthesis and l1-analysis minimizations

The l1-synthesis and l1-analysis models recover structured signals from their undersampled measurements. The solution of the former model is often a sparse sum of dictionary atoms, and that of the latter model often makes sparse correlations with dictionary atoms. This paper addresses the question: when can we trust these models to recover specific signals? We …