Range of the displacement operator of PDHG with applications to quadratic and conic programming

Primal-dual hybrid gradient (PDHG) is a first-order method for saddle-point problems and convex programming introduced by Chambolle and Pock. Recently, Applegate et al. analyzed the behavior of PDHG when applied to an infeasible or unbounded instance of linear programming, and in particular, showed that PDHG is able to diagnose these conditions. Their analysis hinges on … Read more

MGProx: A nonsmooth multigrid proximal gradient method with adaptive restriction for strongly convex optimization

We study the combination of proximal gradient descent with multigrid for solving a class of possibly nonsmooth strongly convex optimization problems. We propose a multigrid proximal gradient method called MGProx, which accelerates the proximal gradient method by multigrid, based on utilizing hierarchical information of the optimization problem. MGProx applies a newly introduced adaptive restriction operator … Read more

Computational complexity of decomposing a symmetric matrix as a sum of positive semidefinite and diagonal matrices

We study several variants of decomposing a symmetric matrix into a sum of a low-rank positive semidefinite matrix and a diagonal matrix. Such decompositions have applications in factor analysis and they have been studied for many decades. On the one hand, we prove that when the rank of the positive semidefinite matrix in the decomposition … Read more

Nonlinear conjugate gradient for smooth convex functions

The method of nonlinear conjugate gradients (NCG) is widely used in practice for unconstrained optimization, but it satisfies weak complexity bounds at best when applied to smooth convex functions. In contrast, Nesterov’s accelerated gradient (AG) method is optimal up to constant factors for this class. However, when specialized to quadratic function, conjugate gradient is optimal … Read more

On identifying clusters from sum-of-norms clustering computation

Sum-of-norms clustering is a clustering formulation based on convex optimization that automatically induces hierarchy. Multiple algorithms have been proposed to solve the optimization problem: subgradient descent by Hocking et al.\ \cite{hocking}, ADMM and ADA by Chi and Lange\ \cite{Chi}, stochastic incremental algorithm by Panahi et al.\ \cite{Panahi} and semismooth Newton-CG augmented Lagrangian method by Yuan … Read more

Provable Overlapping Community Detection in Weighted Graphs

Community detection is a widely-studied unsupervised learning problem in which the task is to group similar entities together based on observed pairwise entity interactions. This problem has applications in diverse domains such is social network analysis and computational biology. There is a significant amount of literature studying this problem under the assumption that the communities … Read more

A termination criterion for stochastic gradient descent for binary classification

We propose a new, simple, and computationally inexpensive termination test for constant step-size stochastic gradient descent (SGD) applied to binary classification on the logistic and hinge loss with homogeneous linear predictors. Our theoretical results support the effectiveness of our stopping criterion when the data is Gaussian distributed. This presence of noise allows for the possibility … Read more

Potential-based analyses of first-order methods for constrained and composite optimization

We propose potential-based analyses for first-order algorithms applied to constrained and composite minimization problems. We first propose “idealized” frameworks for algorithms in the strongly and non-strongly convex cases and argue based on a potential that methods following the framework achieve the best possible rate. Then we show that the geometric descent (GD) algorithm by Bubeck … Read more

Recovery of a mixture of Gaussians by sum-of-norms clustering

Sum-of-norms clustering is a method for assigning $n$ points in $\R^d$ to $K$ clusters, $1\le K\le n$, using convex optimization. Recently, Panahi et al.\ proved that sum-of-norms clustering is guaranteed to recover a mixture of Gaussians under the restriction that the number of samples is not too large. The purpose of this note is to … Read more

A single potential governing convergence of conjugate gradient, accelerated gradient and geometric descent

Nesterov’s accelerated gradient (AG) method for minimizing a smooth strongly convex function $f$ is known to reduce $f({\bf x}_k)-f({\bf x}^*)$ by a factor of $\epsilon\in(0,1)$ after $k=O(\sqrt{L/\ell}\log(1/\epsilon))$ iterations, where $\ell,L$ are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of … Read more