Global Convergence of ADMM in Nonconvex Nonsmooth Optimization

In this paper, we analyze the convergence of the alternating direction method of multipliers (ADMM) for minimizing a nonconvex and possibly nonsmooth objective function, $\phi(x_1,\ldots,x_p,y)$, subject to linear equality constraints that couple $x_1,\ldots,x_p,y$, where $p\ge 1$ is an integer. Our ADMM sequentially updates the primal variables in the order $x_1,\ldots,x_p,y$, followed by updating the dual … Read more
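
A minimal sketch of the iteration described above, assuming the constraint is written as $A_1x_1+\cdots+A_px_p+By=b$ with penalty parameter $\beta>0$ and multiplier $w$ (this notation is not from the abstract, and the paper's subproblems may carry additional proximal terms):

$$
\begin{aligned}
x_i^{k+1} &\in \operatorname*{arg\,min}_{x_i}\ L_\beta\big(x_1^{k+1},\ldots,x_{i-1}^{k+1},\,x_i,\,x_{i+1}^{k},\ldots,x_p^{k},\,y^{k},\,w^{k}\big), \qquad i=1,\ldots,p,\\
y^{k+1} &\in \operatorname*{arg\,min}_{y}\ L_\beta\big(x_1^{k+1},\ldots,x_p^{k+1},\,y,\,w^{k}\big),\\
w^{k+1} &= w^{k}+\beta\big(A_1x_1^{k+1}+\cdots+A_px_p^{k+1}+By^{k+1}-b\big),
\end{aligned}
$$

where $L_\beta(x_1,\ldots,x_p,y,w)=\phi(x_1,\ldots,x_p,y)+\langle w,\ A_1x_1+\cdots+A_px_p+By-b\rangle+\tfrac{\beta}{2}\|A_1x_1+\cdots+A_px_p+By-b\|^2$ is the augmented Lagrangian.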

ExtraPush for Convex Smooth Decentralized Optimization over Directed Networks

In this note, we extend the existing algorithms Extra and subgradient-push into a new algorithm, ExtraPush, for convex consensus optimization over a directed network. When the network is stationary, we propose a simplified algorithm called Normalized ExtraPush. These algorithms use a fixed step size, as in Extra, and accept column-stochastic mixing matrices, as in … Read more
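
As background for this extension, the EXTRA recursion on an undirected network with a doubly stochastic mixing matrix $W$ (with $\tilde W=(I+W)/2$, step size $\alpha$, and $\nabla f(x^k)$ stacking the agents' local gradients; a sketch of the starting point, not the ExtraPush update itself) reads

$$x^{k+2}=(I+W)\,x^{k+1}-\tilde W x^{k}-\alpha\big[\nabla f(x^{k+1})-\nabla f(x^{k})\big].$$

Roughly speaking, ExtraPush replaces the doubly stochastic $W$ with a column-stochastic matrix suited to directed graphs and, in the spirit of subgradient-push, corrects for the resulting imbalance.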

ARock: an Algorithmic Framework for Asynchronous Parallel Coordinate Updates

We propose ARock, an asynchronous parallel algorithmic framework for finding a fixed point of a nonexpansive operator. In the framework, a set of agents (machines, processors, or cores) update a sequence of randomly selected coordinates of the unknown variable in an asynchronous parallel fashion. As special cases of ARock, novel algorithms for linear systems, convex … Read more
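
In the setting described above, a minimal way to write the coordinate update (a sketch; here $T$ is the nonexpansive operator, $S=I-T$, $i_k$ is the randomly selected coordinate, $\hat x^{k}$ is the possibly stale copy of the variable that the updating agent has read, and $\eta_k$ is a step size, possibly scaled by the sampling probabilities):

$$x_{i_k}^{k+1}=x_{i_k}^{k}-\eta_k\big(S\hat x^{k}\big)_{i_k},\qquad x_j^{k+1}=x_j^{k}\ \ \text{for } j\neq i_k,$$

so each agent changes a single coordinate per update and never waits for the others to finish.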

A Three-Operator Splitting Scheme and its Optimization Applications

Operator splitting schemes have been successfully used in computational sciences to break complex problems into a series of simpler subproblems. Since the 1950s, these schemes have been widely used to solve problems in PDE and control. Recently, large-scale optimization problems in machine learning, signal processing, and imaging have created a resurgence of interest in operator-splitting-based … Read more
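
In the optimization setting, the three-operator scheme discussed here can be sketched as follows, assuming an objective of the form $f+g+h$ with $f,g$ admitting easy proximal maps and $h$ smooth (the symbols $\gamma$, $\lambda_k$, $z^k$ are assumed notation, with $\gamma>0$ a step size and $\lambda_k\in(0,1]$ a relaxation parameter):

$$
\begin{aligned}
x_g^{k}&=\operatorname{prox}_{\gamma g}\!\big(z^{k}\big),\\
x_f^{k}&=\operatorname{prox}_{\gamma f}\!\big(2x_g^{k}-z^{k}-\gamma\nabla h(x_g^{k})\big),\\
z^{k+1}&=z^{k}+\lambda_k\big(x_f^{k}-x_g^{k}\big).
\end{aligned}
$$

Setting $h=0$ recovers Douglas-Rachford splitting, while setting $g=0$ recovers the forward-backward (proximal gradient) iteration.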

Self Equivalence of the Alternating Direction Method of Multipliers

The alternating direction method of multipliers (ADM or ADMM) breaks a complex optimization problem into much simpler subproblems. The ADM algorithms are typically short and easy to implement yet exhibit (nearly) state-of-the-art performance for large-scale optimization problems. To apply ADM, we first reformulate a given problem in the “ADM-ready” form, so the final algorithm depends … Read more
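
For reference, the standard “ADM-ready” form referred to above is

$$\min_{x,y}\ f(x)+g(y)\quad\text{subject to}\quad Ax+By=b,$$

and ADM alternates a minimization of the augmented Lagrangian over $x$ with $y$ fixed, a minimization over $y$ with $x$ fixed, and a multiplier (dual) ascent step; different ways of casting the same problem into this form therefore lead to different final algorithms.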

Block stochastic gradient iteration for convex and nonconvex optimization

The stochastic gradient (SG) method can minimize an objective function composed of a large number of differentiable functions, or solve a stochastic optimization problem, to a moderate accuracy. The block coordinate descent/update (BCD) method, on the other hand, handles problems with multiple blocks of variables by updating them one at a time; when the blocks … Read more
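
A minimal sketch of one block stochastic gradient step, assuming the variable is partitioned into blocks $x=(x_1,\ldots,x_s)$, $\tilde\nabla_{i}F$ denotes a sampled estimate of the partial gradient with respect to block $i$, and $\alpha_k$ is a step size (the method in the paper may order the blocks differently and may add proximal steps for nonsmooth block regularizers):

$$x_{i_k}^{k+1}=x_{i_k}^{k}-\alpha_k\,\tilde\nabla_{i_k}F\big(x^{k}\big),\qquad x_j^{k+1}=x_j^{k}\ \ \text{for } j\neq i_k,$$

where $i_k$ is the block selected (cyclically or at random) at iteration $k$.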

Faster convergence rates of relaxed Peaceman-Rachford and ADMM under regularity assumptions

Splitting schemes are a class of powerful algorithms that solve complicated monotone inclusion and convex optimization problems that are built from many simpler pieces. They give rise to algorithms in which the simple pieces of the decomposition are processed individually. This leads to easily implementable and highly parallelizable algorithms, which often obtain nearly state-of-the-art performance. … Read more
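
Concretely, the relaxed Peaceman-Rachford iteration discussed here can be written with resolvents $J_{\gamma A}=(I+\gamma A)^{-1}$ and reflected resolvents $R_{\gamma A}=2J_{\gamma A}-I$ (a sketch; $\gamma>0$ and the relaxation parameters $\lambda_k\in(0,1]$ are assumed notation):

$$z^{k+1}=(1-\lambda_k)\,z^{k}+\lambda_k\,R_{\gamma B}R_{\gamma A}\,z^{k},$$

where $\lambda_k\equiv 1/2$ gives Douglas-Rachford splitting and $\lambda_k\equiv 1$ gives the (unrelaxed) Peaceman-Rachford scheme.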

Convergence rate analysis of several splitting schemes

Splitting schemes are a class of powerful algorithms that solve complicated monotone inclusions and convex optimization problems that are built from many simpler pieces. They give rise to algorithms in which the simple pieces of the decomposition are processed individually. This leads to easily implementable and highly parallelizable algorithms, which often obtain nearly state-of-the-art performance. … Read more

Parallel Multi-Block ADMM with o(1/k) Convergence

This paper introduces a parallel and distributed extension to the alternating direction method of multipliers (ADMM). The algorithm decomposes the original problem into N smaller subproblems and solves them in parallel at each iteration. This Jacobi-type algorithm is well suited for distributed computing and is particularly attractive for solving certain large-scale problems. This paper introduces … Read more
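
To make the parallel structure concrete, a sketch of one iteration for a problem of the form $\min_x \sum_{i=1}^N f_i(x_i)$ subject to $\sum_{i=1}^N A_ix_i=b$ (the penalty $\beta>0$ and multiplier $w$ are assumed notation, and the algorithm in the paper adds proximal or damping terms to guarantee convergence):

$$
\begin{aligned}
x_i^{k+1}&\in\operatorname*{arg\,min}_{x_i}\ f_i(x_i)+\tfrac{\beta}{2}\Big\|A_ix_i+\textstyle\sum_{j\neq i}A_jx_j^{k}-b+w^{k}/\beta\Big\|^2\qquad\text{(all $i$ in parallel)},\\
w^{k+1}&=w^{k}+\beta\Big(\textstyle\sum_{i=1}^{N}A_ix_i^{k+1}-b\Big).
\end{aligned}
$$

Because every subproblem uses only the previous iterate $x^{k}$, the $N$ subproblems are independent and can be solved simultaneously on different machines.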

On the Convergence of Decentralized Gradient Descent

Consider the consensus problem of minimizing $f(x)=\sum_{i=1}^n f_i(x)$ where each $f_i$ is known only to one individual agent $i$ in a connected network of $n$ agents. All the agents collaboratively solve this problem and obtain the solution, with data exchanges restricted to neighboring agents. Such algorithms avoid the need for a … Read more
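
The decentralized gradient descent iteration analyzed here is typically written as follows (a sketch; $x_{(i)}^{k}$ is agent $i$'s local copy of the variable, $w_{ij}$ are mixing weights that are nonzero only between neighbors, and $\alpha$ is a step size):

$$x_{(i)}^{k+1}=\sum_{j=1}^{n}w_{ij}\,x_{(j)}^{k}-\alpha\,\nabla f_i\big(x_{(i)}^{k}\big),\qquad i=1,\ldots,n,$$

so each agent averages its neighbors' copies and then takes a gradient step on its own local objective.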