A Proximal Stochastic Gradient Method with Progressive Variance Reduction

We consider the problem of minimizing the sum of two convex functions: one is the average of a large number of smooth component functions, and the other is a general convex function that admits a simple proximal mapping. We assume the whole objective function is strongly convex. Such problems often arise in machine learning, known … Read more

Alternating Proximal Gradient Method for Convex Minimization

In this paper, we propose an alternating proximal gradient method that solves convex minimization problems with three or more separable blocks in the objective function. Our method is based on the framework of alternating direction method of multipliers. The main computational effort in each iteration of the proposed method is to compute the proximal mappings … Read more

A Block Coordinate Descent Method for Regularized Multi-Convex Optimization with Applications to Nonnegative Tensor Factorization and Completion

This paper considers regularized block multi-convex optimization, where the feasible set and objective function are generally non-convex but convex in each block of variables. We review some of its interesting examples and propose a generalized block coordinate descent method. (Using proximal updates, we further allow non-convexity over some blocks.) Under certain conditions, we show that … Read more

A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem

We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS) problem in the context of sparse recovery, for applications such as compressed sensing. The standard proximal gradient method, also known as iterative soft-thresholding when applied to this problem, has low computational cost per iteration but a rather slow convergence rate. Nevertheless, when the solution is sparse, it often … Read more

Trace Norm Regularization: Reformulations, Algorithms, and Multi-task Learning

We consider a recently proposed optimization formulation of multi-task learning based on trace norm regularized least squares. While this problem may be formulated as a semidefinite program (SDP), its size is beyond general SDP solvers. Previous solution approaches apply proximal gradient methods to solve the primal problem. We derive new primal and dual reformulations of … Read more