Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Lojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Lojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Lojasiewicz (PL) … Read more

A Second-Order Method for Strongly Convex L1-Regularization Problems

In this paper a robust second-order method is developed for the solution of strongly convex l1-regularized problems. The main aim is to make the proposed method as inexpensive as possible, while even difficult problems can be efficiently solved. The proposed method is a primal-dual Newton Conjugate Gradients (pdNCG) method. Convergence properties of pdNCG are studied … Read more

Matrix-free Interior Point Method for Compressed Sensing Problems

We consider a class of optimization problems for sparse signal reconstruction which arise in the field of Compressed Sensing (CS). A plethora of approaches and solvers exist for such problems, for example GPSR, FPC AS, SPGL1, NestA, l1 ls, PDCO to mention a few. CS applications lead to very well conditioned optimization problems and therefore … Read more

A Family of Second-Order Methods for Convex L1-Regularized Optimization

This paper is concerned with the minimization of an objective that is the sum of a convex function $f$ and an $\ell_1$ regularization term. Our interest is in methods that incorporate second-order information about the function $f$ to accelerate convergence. We describe a semi-smooth Newton framework that can be used to generate a variety of … Read more

Sample Size Selection in Optimization Methods for Machine Learning

This paper presents a methodology for using varying sample sizes in batch-type optimization methods for large scale machine learning problems. The first part of the paper deals with the delicate issue of dynamic sample selection in the evaluation of the function and gradient. We propose a criterion for increasing the sample size based on variance … Read more

Iteration Complexity of Randomized Block-Coordinate Descent Methods for Minimizing a Composite Function

In this paper we develop a randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function and prove that it obtains an $\epsilon$-accurate solution with probability at least $1-\rho$ in at most $O(\tfrac{n}{\epsilon} \log \tfrac{1}{\rho})$ iterations, where $n$ is the number of blocks. For strongly convex functions … Read more