Stochastic Compositional Gradient Descent: Algorithms for Minimizing Compositions of Expected-Value Functions

Classical stochastic gradient methods are well suited for minimizing expected-value objective functions. However, they do not apply to the minimization of a nonlinear function involving expected values or a composition of two expected-value functions, i.e., problems of the form $\min_x \mathbb{E}_v\big[f_v\big(\mathbb{E}_w[g_w(x)]\big)\big]$. In order to solve this stochastic composition problem, we propose a class … Read more
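
The driving idea in methods of this type is a two-timescale update: an auxiliary variable tracks the inner expectation $\mathbb{E}_w[g_w(x)]$ on a fast timescale while $x$ moves slowly. Below is a minimal sketch of such a compositional update on a made-up toy problem (the inner map $g_w(x)=x+w$, the noisy quadratic outer loss, and the classical $k^{-3/4}/k^{-1/2}$ step-size schedule are illustrative assumptions, not the paper's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
target = np.ones(d)

def g_sample(x):
    # Noisy inner map g_w(x) = x + w with E_w[g_w(x)] = x (toy assumption).
    return x + 0.1 * rng.standard_normal(d)

def f_grad_sample(y):
    # Stochastic outer gradient for f_v(y) = 0.5*||y - target - v||^2, E[v] = 0.
    return y - target - 0.1 * rng.standard_normal(d)

x = np.zeros(d)
y = g_sample(x)                     # running estimate of E_w[g_w(x)]
for k in range(1, 20001):
    alpha = 0.5 / k ** 0.75         # slow timescale for x
    beta = 1.0 / k ** 0.5           # fast timescale for the tracking variable
    y = (1 - beta) * y + beta * g_sample(x)    # track the inner expectation
    x = x - alpha * f_grad_sample(y)           # Jacobian of g_w is I in this toy
print(np.round(x, 2))               # should end up close to `target`
```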

Fast Algorithms for the Minimum Volume Estimator

The minimum volume ellipsoid (MVE) estimator is an important tool in robust regression and outlier detection in statistics. We develop fast algorithms for the MVE estimator problem and discuss how they can be implemented efficiently. The novelty of our approach stems from recent developments in first-order algorithms for solving the related Minimum Volume Enclosing … Read more
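
For orientation, the related problem alluded to at the end is the minimum volume enclosing ellipsoid (MVEE). A sketch of Khachiyan's classical first-order (Frank-Wolfe-type) scheme for the MVEE of a point set follows; this is the standard textbook method, shown only to fix ideas, not the algorithms developed in the paper:

```python
import numpy as np

def mvee_khachiyan(P, tol=1e-4):
    """Khachiyan's first-order scheme for the minimum volume enclosing
    ellipsoid {x : (x - c)^T A (x - c) <= 1} of the columns of P (d x n)."""
    d, n = P.shape
    Q = np.vstack([P, np.ones(n)])           # lift to homogeneous coordinates
    u = np.full(n, 1.0 / n)                  # weights on the points
    while True:
        M = (Q * u) @ Q.T
        kappa = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(M), Q)
        j = np.argmax(kappa)                 # most "outlying" point
        if kappa[j] <= (1 + tol) * (d + 1):  # approximate optimality reached
            break
        step = (kappa[j] - d - 1) / ((d + 1) * (kappa[j] - 1))
        u = (1 - step) * u                   # shift mass toward point j
        u[j] += step
    c = P @ u                                # center of the ellipsoid
    A = np.linalg.inv((P * u) @ P.T - np.outer(c, c)) / d
    return A, c
```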

A Branch-and-Bound Algorithm for Instrumental Variable Quantile Regression

This paper studies a statistical problem called instrumental variable quantile regression (IVQR). We model IVQR as a convex quadratic program with complementarity constraints and—although this type of program is generally NP-hard—we develop a branch-and-bound algorithm to solve it globally. We also derive bounds on key variables in the problem, which are valid asymptotically … Read more
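
The standard branch-and-bound template for complementarity constraints branches on which member of each pair is forced to zero and prunes with the bound from a convex relaxation that drops the remaining complementarities. A minimal sketch on a hypothetical toy instance (a tiny nonnegative least-squares problem with two complementarity pairs; this is the generic scheme, not the paper's IVQR model or its asymptotic bounds):

```python
import numpy as np
from scipy.optimize import minimize

t = np.array([1.0, -0.5, 0.8, 0.3])      # toy data: min ||x - t||^2, x >= 0
pairs = [(0, 2), (1, 3)]                 # complementarity: x_i * x_j = 0

def solve_node(fixed_zero):
    # Convex relaxation: only already-branched variables are pinned to zero;
    # complementarity on unbranched pairs is dropped, so this is a lower bound.
    cons = [{'type': 'eq', 'fun': (lambda x, i=i: x[i])} for i in fixed_zero]
    res = minimize(lambda x: np.sum((x - t) ** 2), np.zeros(4),
                   bounds=[(0, None)] * 4, constraints=cons)
    return res.fun, res.x

best_val, best_x = np.inf, None

def branch(k, fixed):
    global best_val, best_x
    val, x = solve_node(fixed)
    if val >= best_val:                  # prune: bound already beats this node
        return
    if k == len(pairs):                  # leaf: every pair resolved, feasible
        best_val, best_x = val, x
        return
    i, j = pairs[k]
    branch(k + 1, fixed + [i])           # left child: enforce x_i = 0
    branch(k + 1, fixed + [j])           # right child: enforce x_j = 0

branch(0, [])
print(best_val, np.round(best_x, 3))
```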

Alternating direction method of multipliers for sparse zero-variance discriminant analysis and principal component analysis

We consider the task of classification in the high-dimensional setting where the number of features of the given data is significantly greater than the number of observations. To accomplish this task, we propose sparse zero-variance discriminant analysis (SZVD) as a method for simultaneously performing linear discriminant analysis and feature selection on high-dimensional data. This method combines … Read more
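
As background on the ADMM machinery, methods of this family alternate a smooth subproblem, an elementwise soft-thresholding step for the sparsity-inducing term, and a dual update. A generic sketch on the lasso, chosen only because its updates are compact (SZVD's subproblems differ in the details):

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=200):
    """ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1 subject to x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    L = np.linalg.cholesky(AtA + rho * np.eye(n))           # factor once, reuse
    for _ in range(iters):
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))   # smooth subproblem
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z                                       # scaled dual ascent step
    return z
```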

Subset Selection by Mallows’ Cp: A Mixed Integer Programming Approach

This paper concerns a method of selecting the best subset of explanatory variables for a linear regression model. Employing Mallows' $C_p$ as a goodness-of-fit measure, we formulate the subset selection problem as a mixed integer quadratic programming problem. Computational results demonstrate that our method provides the best subset of variables in a few seconds when … Read more
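
For a handful of candidate variables, the criterion can be checked by brute force: fit every subset and score it with $C_p$. The sketch below does exactly that, and is the exponential-time baseline the MIQP formulation is designed to avoid (the common convention $C_p = \mathrm{RSS}_S/\hat\sigma^2 + 2|S| - n$, with $\hat\sigma^2$ estimated from the full model, is an assumption here):

```python
import itertools
import numpy as np

def best_subset_cp(X, y):
    """Brute-force Mallows' C_p subset selection (baseline, not the MIQP)."""
    n, d = X.shape
    beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_full
    sigma2 = resid @ resid / (n - d)          # error variance from the full model
    best = (np.inf, ())
    for k in range(1, d + 1):
        for S in itertools.combinations(range(d), k):
            cols = list(S)
            beta = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
            r = y - X[:, cols] @ beta
            cp = r @ r / sigma2 + 2 * k - n   # Mallows' C_p for subset S
            best = min(best, (cp, S))
    return best                               # (best C_p value, best subset)
```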

Generalized Gauss Inequalities via Semidefinite Programming

A sharp upper bound on the probability of a random vector falling outside a polytope, based solely on the first and second moments of its distribution, can be computed efficiently using semidefinite programming. However, this Chebyshev-type bound tends to be overly conservative since it is determined by a discrete worst-case distribution. In this paper we … Read more
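
For concreteness, the Chebyshev-type baseline mentioned in the first sentence can be written as a small SDP: search for a nonnegative quadratic $[x;1]^\top M [x;1]$ that is at least 1 whenever some facet inequality of the polytope is violated (one S-procedure constraint per facet), and minimize its expectation under the given moments. A sketch with cvxpy, where the data and the formulation details are illustrative assumptions rather than the paper's generalized Gauss bound:

```python
import cvxpy as cp
import numpy as np

mu, Sig = np.zeros(2), np.eye(2)                 # given first and second moments
A = np.array([[1.0, 0], [-1, 0], [0, 1], [0, -1]])
b = np.full(4, 2.0)                              # polytope: the box [-2, 2]^2

# Second-moment matrix of the lifted vector [x; 1].
Omega = np.block([[Sig + np.outer(mu, mu), mu[:, None]],
                  [mu[None, :], np.ones((1, 1))]])

M = cp.Variable((3, 3), PSD=True)                # quadratic nonnegative everywhere
tau = cp.Variable(len(b), nonneg=True)
E = np.zeros((3, 3)); E[2, 2] = 1.0              # represents the constant 1
cons = []
for i in range(len(b)):
    G = np.zeros((3, 3))                         # represents a_i^T x - b_i
    G[:2, 2] = A[i] / 2; G[2, :2] = A[i] / 2; G[2, 2] = -b[i]
    cons.append(M - E - tau[i] * G >> 0)         # quadratic >= 1 where a_i^T x >= b_i

prob = cp.Problem(cp.Minimize(cp.trace(Omega @ M)), cons)
prob.solve()
print(prob.value)    # worst-case bound on P(x leaves the box) given the moments
```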

A First-Order Algorithm for the A-Optimal Experimental Design Problem: A Mathematical Programming Approach

We develop and analyse a first-order algorithm for the A-optimal experimental design problem. The problem is first presented as a special case of a parametric family of optimal design problems for which duality results and optimality conditions are given. Then, two first-order (Frank-Wolfe type) algorithms are presented, accompanied by a detailed time-complexity analysis of the … Read more
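
To fix ideas, a plain Frank-Wolfe iteration for the A-criterion $\operatorname{tr}(M(w)^{-1})$, with $M(w)=\sum_i w_i x_i x_i^\top$, moves weight toward the design point with the most negative partial derivative $-x_i^\top M(w)^{-2} x_i$. A minimal sketch with the standard diminishing step (illustrative only; the paper's algorithms, step-size rules, and complexity analysis are more refined):

```python
import numpy as np

def a_optimal_fw(X, iters=500):
    """Frank-Wolfe sketch for A-optimal design over the simplex.
    Rows of X are the candidate design points; X must have full column rank."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                       # start at the uniform design
    for k in range(iters):
        M = X.T @ (w[:, None] * X)                # information matrix M(w)
        Minv = np.linalg.inv(M)
        scores = np.einsum('ij,jk,ik->i', X, Minv @ Minv, X)  # x_i^T M^-2 x_i
        j = np.argmax(scores)                     # Frank-Wolfe vertex
        gamma = 2.0 / (k + 3)                     # standard diminishing step
        w *= 1 - gamma
        w[j] += gamma
    return w
```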

The Direct Extension of ADMM for Multi-block Convex Minimization Problems is Not Necessarily Convergent

The alternating direction method of multipliers (ADMM) is now widely used in many fields, and its convergence was proved when two blocks of variables are alternately updated. It is strongly desirable and practically valuable to extend ADMM directly to the case of a multi-block convex minimization problem whose objective function is the sum of … Read more
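
The phenomenon can be observed numerically. The sketch below runs direct three-block ADMM, with zero objective, on a 3x3 nonsingular linear system of the kind used in the paper's counterexample (the specific matrix and the claim that this particular run diverges are assumptions on my part; the paper's argument is exact):

```python
import numpy as np

A = np.array([[1.0, 1, 1],
              [1, 1, 2],
              [1, 2, 2]])          # columns play the roles of A1, A2, A3
beta = 1.0                         # penalty parameter
x = np.ones(3)                     # block variables x1, x2, x3; true solution is 0
lam = np.zeros(3)                  # multiplier for the constraint A x = 0
for k in range(1, 101):
    for i in range(3):             # Gauss-Seidel sweep: update blocks in order
        ai = A[:, i]
        rest = A @ x - ai * x[i]   # constraint residual excluding block i
        x[i] = -ai @ (rest + lam / beta) / (ai @ ai)   # exact block minimizer
    lam = lam + beta * (A @ x)     # dual ascent step
    if k % 20 == 0:
        print(k, np.linalg.norm(A @ x))   # residual grows instead of vanishing
```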

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art results for various key machine learning optimization problems including SVM, logistic regression, ridge regression, Lasso, and multiclass SVM. … Read more
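
For background, the unaccelerated method updates one dual coordinate at a time in closed form while maintaining the primal-dual link $w = \frac{1}{\lambda n}\sum_i \alpha_i a_i$. A sketch for ridge regression, where the coordinate step happens to be closed form (the proximal and accelerated inner-outer layers of the paper are not shown):

```python
import numpy as np

def sdca_ridge(A, y, lam, epochs=50, seed=0):
    """Plain SDCA for min (1/n) sum_i 0.5*(a_i^T w - y_i)^2 + (lam/2)*||w||^2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    alpha, w = np.zeros(n), np.zeros(d)
    sq = np.einsum('ij,ij->i', A, A)               # precomputed ||a_i||^2
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Closed-form maximizer of the dual along coordinate i.
            delta = (y[i] - A[i] @ w - alpha[i]) / (1 + sq[i] / (lam * n))
            alpha[i] += delta
            w += delta * A[i] / (lam * n)          # keep w = A^T alpha / (lam n)
    return w
```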

Composite Self-concordant Minimization

We propose a variable metric framework for minimizing the sum of a self-concordant function and a possibly non-smooth convex function endowed with a computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz gradient assumption on the smooth part. An important highlight of our work is a new … Read more
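
To isolate the self-concordant mechanism, take the nonsmooth part to be zero: damped Newton with step $1/(1+\lambda_k)$, where $\lambda_k$ is the Newton decrement, converges on a self-concordant $f$ with no Lipschitz constant in sight. A sketch on a log-barrier (this smooth special case is illustrative; the paper's framework additionally handles the proximal term and variable metrics):

```python
import numpy as np

# f(x) = -sum_i log(b_i - a_i^T x), a standard self-concordant function.
rng = np.random.default_rng(1)
A = np.vstack([rng.standard_normal((20, 5)), np.eye(5), -np.eye(5)])
b = np.concatenate([1.0 + rng.random(20), np.full(10, 2.0)])  # x = 0 is feasible
x = np.zeros(5)
for _ in range(30):
    s = b - A @ x                              # slacks, must remain positive
    grad = A.T @ (1.0 / s)
    H = A.T @ ((1.0 / s ** 2)[:, None] * A)    # Hessian of the barrier
    d = np.linalg.solve(H, -grad)              # Newton direction
    lam = np.sqrt(d @ H @ d)                   # Newton decrement (local norm)
    x = x + d / (1.0 + lam)                    # damped step; keeps x feasible
print(np.linalg.norm(A.T @ (1.0 / (b - A @ x))))   # gradient norm after 30 steps
```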