Stochastic Variance-Reduced Prox-Linear Algorithms for Nonconvex Composite Optimization

We consider minimization of composite functions of the form $f(g(x))+h(x)$, where $f$ and $h$ are convex functions (which can be nonsmooth) and $g$ is a smooth vector mapping. In addition, we assume that $g$ is the average of finite number of component mappings or the expectation over a family of random component mappings. We propose … Read more

Accelerated Bregman Proximal Gradient Methods for Relatively Smooth Convex Optimization

We consider the problem of minimizing the sum of two convex functions: one is differentiable and relatively smooth with respect to a reference convex function, and the other can be nondifferentiable but simple to optimize. The relatively smooth condition is much weaker than the standard assumption of uniform Lipschitz continuity of the gradients, thus significantly … Read more

On self-concordant barriers for generalized power cones

In the study of interior-point methods for nonsymmetric conic optimization and their applications, Nesterov introduced the power cone, together with a 4-self-concordant barrier for it. In his PhD thesis, Chares found an improved 3-self-concordant barrier for the power cone. In addition, he introduced the generalized power cone, and conjectured a nearly optimal self-concordant barrier for … Read more

DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization

Machine learning with big data often involves large optimization models. For distributed optimization over a cluster of machines, frequent communication and synchronization of all model parameters (optimization variables) can be very costly. A promising solution is to use parameter servers to store different subsets of the model parameters, and update them asynchronously at different machines … Read more

Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss

We consider distributed convex optimization problems originated from sample average approximation of stochastic optimization, or empirical risk minimization in machine learning. We assume that each machine in the distributed computing system has access to a local empirical loss function, constructed with i.i.d. data sampled from a common distribution. We propose a communication-efficient distributed algorithm to … Read more

Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization

We consider a generic convex optimization problem associated with regularized empirical risk minimization of linear predictors. The problem structure allows us to reformulate it as a convex-concave saddle point problem. We propose a stochastic primal-dual coordinate (SPDC) method, which alternates between maximizing over a randomly chosen dual variable and minimizing over the primal variable. An … Read more

An Accelerated Proximal Coordinate Gradient Method and its Application to Regularized Empirical Risk Minimization

We consider the problem of minimizing the sum of two convex functions: one is smooth and given by a gradient oracle, and the other is separable over blocks of coordinates and has a simple known structure over each block. We develop an accelerated randomized proximal coordinate gradient (APCG) method for minimizing such convex composite functions. … Read more

A Proximal Stochastic Gradient Method with Progressive Variance Reduction

We consider the problem of minimizing the sum of two convex functions: one is the average of a large number of smooth component functions, and the other is a general convex function that admits a simple proximal mapping. We assume the whole objective function is strongly convex. Such problems often arise in machine learning, known … Read more

Randomized Block Coordinate Non-Monotone Gradient Method for a Class of Nonlinear Programming

In this paper we propose a randomized block coordinate non-monotone gradient (RBCNMG) method for minimizing the sum of a smooth (possibly nonconvex) function and a block-separable (possibly nonconvex nonsmooth) function. At each iteration, this method randomly picks a block according to any prescribed probability distribution and typically solves several associated proximal subproblems that usually have … Read more

On the Complexity Analysis of Randomized Block-Coordinate Descent Methods

In this paper we analyze the randomized block-coordinate descent (RBCD) methods for minimizing the sum of a smooth convex function and a block-separable convex function. In particular, we extend Nesterov’s technique (SIOPT 2012) for analyzing the RBCD method for minimizing a smooth convex function over a block-separable closed convex set to the aforementioned more general … Read more