Potential-based analyses of first-order methods for constrained and composite optimization

We propose potential-based analyses for first-order algorithms applied to constrained and composite minimization problems. We first propose “idealized” frameworks for algorithms in the strongly and non-strongly convex cases and argue based on a potential that methods following the framework achieve the best possible rate. Then we show that the geometric descent (GD) algorithm by Bubeck …

Recovery of a mixture of Gaussians by sum-of-norms clustering

Sum-of-norms clustering is a method for assigning $n$ points in $\mathbb{R}^d$ to $K$ clusters, $1\le K\le n$, using convex optimization. Recently, Panahi et al. proved that sum-of-norms clustering is guaranteed to recover a mixture of Gaussians under the restriction that the number of samples is not too large. The purpose of this note is to …
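
For reference, sum-of-norms clustering is usually posed as the convex problem sketched below. The symbols $a_i$ (the given data points), $x_i$ (one centroid variable per point) and $\lambda>0$ (the regularization weight) are notation assumed here for illustration; they do not come from the abstract above.

$$
\min_{x_1,\dots,x_n\in\mathbb{R}^d}\ \frac{1}{2}\sum_{i=1}^{n}\Vert x_i-a_i\Vert^2+\lambda\sum_{1\le i<j\le n}\Vert x_i-x_j\Vert
$$

In this formulation, points $a_i$ and $a_j$ are placed in the same cluster when their optimal centroids satisfy $x_i=x_j$, and increasing $\lambda$ tends to fuse more centroids together.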

A single potential governing convergence of conjugate gradient, accelerated gradient and geometric descent

Nesterov’s accelerated gradient (AG) method for minimizing a smooth strongly convex function $f$ is known to reduce $f({\bf x}_k)-f({\bf x}^*)$ by a factor of $\epsilon\in(0,1)$ after $k=O(\sqrt{L/\ell}\log(1/\epsilon))$ iterations, where $\ell,L$ are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of …

A unified convergence bound for conjugate gradient and accelerated gradient

Nesterov’s accelerated gradient method for minimizing a smooth strongly convex function $f$ is known to reduce $f({\bf x}_k)-f({\bf x}^*)$ by a factor of $\epsilon\in(0,1)$ after $k\ge O(\sqrt{L/\ell}\log(1/\epsilon))$ iterations, where $\ell,L$ are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of computation. The …

Finding the largest low-rank clusters with Ky Fan 2-k-norm and l1-norm

We propose a convex optimization formulation with the Ky Fan 2-k-norm and l1-norm to find k largest approximately rank-one submatrix blocks of a given nonnegative matrix that has low-rank block diagonal structure with noise. We analyze low-rank and sparsity structures of the optimal solutions using properties of these two matrix norms. We show that, under …

Extreme point inequalities and geometry of the rank sparsity ball

We investigate geometric features of the unit ball corresponding to the sum of the nuclear norm of a matrix and the $l_1$ norm of its entries, a common penalty function encouraging joint low rank and high sparsity. As a byproduct of this effort, we develop a calculus (or algebra) of faces for general convex …

Semidefinite Programming Based Preconditioning for More Robust Near-Separable Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) under the separability assumption can provably be solved efficiently, even in the presence of noise, and has been shown to be a powerful technique in document classification and hyperspectral unmixing. This problem is referred to as near-separable NMF and requires that there exists a cone spanned by a small subset of …

Some notes on applying computational divided differencing in optimization

We consider the problem of accurate computation of the finite difference $f({\bf x}+{\bf s})-f({\bf x})$ when $\Vert{\bf s}\Vert$ is very small. Direct evaluation of this difference in floating point arithmetic succumbs to cancellation error and yields 0 when ${\bf s}$ is sufficiently small. Nonetheless, accurate computation of this finite difference is required by many optimization algorithms for a “sufficient decrease” …
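
The cancellation described above can be seen in a few lines of Python. This is only a minimal sketch for a scalar function: the choice $f=\exp$, the helper names `naive_difference` and `exp_difference`, and the values of `x` and `s` are assumptions made for illustration. The accurate variant uses the identity $e^{x+s}-e^{x}=e^{x}(e^{s}-1)$ together with `math.expm1`; it is a hand-derived rewrite for this one function, not the computational divided differencing technique that the note studies.

```python
import math


def naive_difference(f, x, s):
    """Evaluate f(x + s) - f(x) directly in floating point."""
    return f(x + s) - f(x)


def exp_difference(x, s):
    """Evaluate exp(x + s) - exp(x) without subtracting nearby values.

    Uses exp(x + s) - exp(x) = exp(x) * (exp(s) - 1), with expm1
    computing exp(s) - 1 accurately for tiny s.
    """
    return math.exp(x) * math.expm1(s)


x = 1.0
s = 1e-17  # smaller than half the spacing of doubles near 1.0 (illustrative)

naive = naive_difference(math.exp, x, s)  # x + s rounds to x, so this is 0.0
accurate = exp_difference(x, s)           # close to e * 1e-17

print("naive:   ", naive)
print("accurate:", accurate)
```

Because `x + s` rounds back to `x` in double precision, the direct evaluation loses all information about `s`, whereas the rewritten form keeps full relative accuracy. A line search that tests for sufficient decrease using the naive value would see no decrease at all at this step size.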

Convex relaxation for finding planted influential nodes in a social network

We consider the problem of maximizing influence in a social network. We focus on the case that the social network is a directed bipartite graph whose arcs join senders to receivers. We consider both the case of deterministic networks and probabilistic graphical models, that is, the so-called “cascade” model. The problem is to find the …

Fast and Robust Recursive Algorithms for Separable Nonnegative Matrix Factorization

In this paper, we study the nonnegative matrix factorization problem under the separability assumption (that is, there exists a cone spanned by a small subset of the columns of the input nonnegative data matrix containing all columns), which is equivalent to the hyperspectral unmixing problem under the linear mixing model and the pure-pixel assumption. We …