Block cubic Newton with greedy selection

A second-order block coordinate descent method is proposed for the unconstrained minimization of an objective function with Lipschitz continuous Hessian. At each iteration, a block of variables is selected by means of a greedy (Gauss-Southwell) rule which considers the amount of first-order stationarity violation, then an approximate minimizer of a cubic model is computed for …
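
To make the selection rule concrete, here is a minimal sketch (Python/NumPy) of one such iteration under assumed details: the block whose gradient sub-vector has the largest norm is chosen (a Gauss-Southwell-type rule), and the cubic model restricted to that block is minimized approximately by a few gradient steps. The block partition, inner iteration count, learning rate, and regularization weight sigma are illustrative assumptions, not the paper's exact algorithm.

    import numpy as np

    def greedy_block_cubic_step(f_grad, f_hess, x, blocks, sigma, inner_iters=50, lr=1e-2):
        """One illustrative iteration: pick the block with the largest gradient norm,
        then approximately minimize a cubic model restricted to that block.
        `blocks` is a list of index lists partitioning the variables."""
        g = f_grad(x)
        # Greedy (Gauss-Southwell) rule: largest first-order stationarity violation.
        b = max(blocks, key=lambda idx: np.linalg.norm(g[idx]))
        H_bb = f_hess(x)[np.ix_(b, b)]      # Hessian sub-block for the chosen block
        g_b = g[b]
        s = np.zeros(len(b))
        # Approximate minimization of the cubic model
        #   m(s) = g_b^T s + 0.5 s^T H_bb s + (sigma/3) ||s||^3
        for _ in range(inner_iters):
            grad_m = g_b + H_bb @ s + sigma * np.linalg.norm(s) * s
            s -= lr * grad_m
        x_new = x.copy()
        x_new[b] += s
        return x_new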

Riemannian trust-region methods for strict saddle functions with complexity guarantees

The difficulty of minimizing a nonconvex function is partly explained by the presence of saddle points, which slow down optimization algorithms and degrade worst-case complexity guarantees. However, many nonconvex problems of interest possess a favorable structure for optimization, in the sense that saddle points can be escaped efficiently by appropriate algorithms. This strict saddle …
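
For context, one common way to formalize the strict saddle property (the precise definition used in the paper may differ) is the following condition: at every point, either the gradient is large, the Hessian has a direction of sufficiently negative curvature, or the point is close to a local minimizer.

    \[
    \forall x:\quad \|\operatorname{grad} f(x)\| \ge \varepsilon
    \;\;\text{or}\;\;
    \lambda_{\min}\!\big(\operatorname{Hess} f(x)\big) \le -\gamma
    \;\;\text{or}\;\;
    \operatorname{dist}(x, \mathcal{X}^{*}) \le \delta ,
    \]

where \(\mathcal{X}^{*}\) denotes the set of local minimizers; in the Riemannian setting, grad and Hess are the Riemannian gradient and Hessian.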

Nonlinear matrix recovery using optimization on the Grassmann manifold

We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure, such as belonging to a union of subspaces, lying on an algebraic variety, or being grouped in clusters. The recovery problem is formulated as the rank minimization of a nonlinear feature map applied to the original matrix, which is then further approximated by …
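
As a rough illustration of this kind of formulation (the paper's precise feature map and subsequent relaxation may differ), with a feature map \(\varphi\) applied column by column, e.g. monomials up to some degree for the algebraic-variety case, the recovery problem can be written as a rank minimization over matrices consistent with the observed entries:

    \[
    \min_{X \in \mathbb{R}^{m \times n}} \ \operatorname{rank}\big(\varphi(X)\big)
    \quad \text{subject to} \quad X_{ij} = M_{ij} \ \text{for } (i,j) \in \Omega ,
    \]

where \(M\) is the partially observed matrix and \(\Omega\) the set of observed entries; the point of the nonlinear map is that \(\varphi(X)\) can be low-rank even when \(X\) itself is high-rank.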

Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization

We establish or refute the optimality of inexact second-order methods for unconstrained nonconvex optimization from the point of view of worst-case evaluation complexity, improving and generalizing the results of Cartis, Gould and Toint (2010, 2011). To this end, we consider a new general class of inexact second-order algorithms for unconstrained optimization that includes regularization and trust-region …
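
As a point of reference from the broader literature (not taken from the truncated abstract), cubic-regularization and suitable trust-region variants are known to require at most on the order of \(\varepsilon^{-3/2}\) evaluations to reach an \(\varepsilon\)-approximate first-order stationary point, and this is the benchmark against which optimality of second-order methods is typically measured:

    \[
    \#\{\text{evaluations until } \|\nabla f(x_k)\| \le \varepsilon\} \;=\; O\!\big(\varepsilon^{-3/2}\big),
    \]

compared with the \(O(\varepsilon^{-2})\) bound typical of steepest-descent-type first-order methods.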

Oracle Complexity of Second-Order Methods for Smooth Convex Optimization

Second-order methods, which utilize gradients as well as Hessians to optimize a given function, are of major importance in mathematical optimization. In this work, we study the oracle complexity of such methods, or equivalently, the number of iterations required to optimize a function to a given accuracy. Focusing on smooth and convex functions, we derive …
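
For readers unfamiliar with the setting, the second-order oracle model referred to here returns, for a query point \(x\), the triple \(\big(f(x), \nabla f(x), \nabla^2 f(x)\big)\); a canonical method in this model is the damped Newton iteration (step-size rule assumed here for illustration):

    \[
    x_{k+1} \;=\; x_k - \eta_k \big(\nabla^2 f(x_k)\big)^{-1} \nabla f(x_k),
    \]

and the oracle complexity is the number of such queries needed to guarantee \(f(x_k) - \min f \le \varepsilon\).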

Optimization Methods for Large-Scale Machine Learning

This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of …
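
Since stochastic gradient methods are the workhorse algorithms such a review centers on, a minimal sketch of stochastic gradient descent (SGD) for empirical risk minimization is included here for orientation (Python/NumPy; the loss, sampling scheme, and step-size schedule are illustrative assumptions):

    import numpy as np

    def sgd(grad_sample, w0, n_samples, steps=1000, lr=0.1, batch=32, seed=0):
        """Minimal SGD loop for empirical risk (1/n) * sum_i f_i(w).
        grad_sample(w, idx) returns the average gradient of f_i over indices idx."""
        rng = np.random.default_rng(seed)
        w = w0.copy()
        for k in range(steps):
            idx = rng.integers(0, n_samples, size=batch)        # sample a mini-batch
            w -= lr / np.sqrt(k + 1) * grad_sample(w, idx)      # diminishing step size
        return w

    # Example: least-squares loss f_i(w) = 0.5 * (a_i^T w - b_i)^2
    rng = np.random.default_rng(1)
    A, b = rng.normal(size=(1000, 5)), rng.normal(size=1000)
    grad = lambda w, idx: A[idx].T @ (A[idx] @ w - b[idx]) / len(idx)
    w_hat = sgd(grad, np.zeros(5), n_samples=1000)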

Iteration-complexity of a Rockafellar’s proximal method of multipliers for convex programming based on second-order approximations

This paper studies the iteration-complexity of a new primal-dual algorithm based on Rockafellar’s proximal method of multipliers (PMM) for solving smooth convex programming problems with inequality constraints. In each step, either a step of Rockafellar’s PMM for a second-order model of the problem is computed or a relaxed extragradient step is performed. The resulting algorithm …
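
For orientation, Rockafellar’s PMM augments the classical augmented Lagrangian iteration with a proximal term on the primal variable; one common way to write it for inequality constraints \(g(x) \le 0\) is the following (the paper’s second-order-model and extragradient modifications are not reflected in this sketch):

    \[
    x^{k+1} \approx \operatorname*{arg\,min}_{x}\;
    f(x) + \tfrac{1}{2c_k}\Big(\big\|\max\{0,\, y^k + c_k\, g(x)\}\big\|^2 - \|y^k\|^2\Big)
    + \tfrac{1}{2c_k}\|x - x^k\|^2,
    \qquad
    y^{k+1} = \max\{0,\, y^k + c_k\, g(x^{k+1})\},
    \]

where \(y^k\) is the multiplier estimate and \(c_k > 0\) the penalty/proximal parameter.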

Newton-like method with diagonal correction for distributed optimization

We consider distributed optimization problems where networked nodes cooperatively minimize the sum of their locally known convex costs. A popular class of methods to solve these problems are the distributed gradient methods, which are attractive for their inexpensive iterations but suffer from slow convergence rates. This motivates the incorporation of second-order information …
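
As background, the distributed gradient (DGD-type) iteration being referred to combines a consensus averaging step over the network with a local gradient step; in a common form (notation assumed here, not taken from the paper):

    \[
    x_i^{k+1} \;=\; \sum_{j \in \mathcal{N}_i \cup \{i\}} w_{ij}\, x_j^{k} \;-\; \alpha\, \nabla f_i(x_i^{k}),
    \qquad i = 1, \dots, n,
    \]

where \(f_i\) is node \(i\)'s local cost, \(\mathcal{N}_i\) its set of neighbors, and \(w_{ij}\) the entries of a doubly stochastic mixing matrix; the paper's Newton-like method is aimed at injecting second-order information into this type of iteration.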

Performance of First- and Second-Order Methods for L1-Regularized Least Squares Problems

We study the performance of first- and second-order optimization methods for l1-regularized sparse least-squares problems as the conditioning and the dimensions of the problem increase up to one trillion. A rigorously defined generator is presented that allows control of the dimensions, the conditioning, and the sparsity of the problem. The generator has very low memory …
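
The problem class under study is l1-regularized least squares, and a representative first-order baseline against which second-order solvers are commonly compared is ISTA (iterative shrinkage-thresholding); a minimal sketch follows (Python/NumPy; step size and regularization weight are illustrative, and this is not the paper's benchmarked implementation):

    import numpy as np

    def ista(A, b, lam, steps=500):
        """Minimize 0.5*||A x - b||^2 + lam*||x||_1 by iterative soft thresholding."""
        L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(steps):
            grad = A.T @ (A @ x - b)               # gradient of the smooth part
            z = x - grad / L                       # gradient step
            x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
        return x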

A Preconditioner for a Primal-Dual Newton Conjugate Gradients Method for Compressed Sensing Problems

In this paper we are concerned with the solution of Compressed Sensing (CS) problems where the signals to be recovered are sparse in coherent and redundant dictionaries. We extend a primal-dual Newton Conjugate Gradients (pdNCG) method for CS problems. We provide an inexpensive and provably effective preconditioning technique for the linear systems arising in pdNCG. Numerical results …
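
For context on the problem class (stated generically, not as the paper's exact formulation): with a coherent and redundant dictionary \(W\), sparsity is imposed on the analysis coefficients \(W^{\top} x\) rather than on \(x\) itself, leading to an analysis-form l1 problem of the type

    \[
    \min_{x} \;\; \tau \,\|W^{\top} x\|_1 \;+\; \tfrac{1}{2}\,\|A x - b\|_2^2,
    \]

where \(A\) is the measurement operator, \(b\) the observed data, and \(\tau > 0\); Newton-CG approaches such as pdNCG typically work with a smoothed version of the l1 term and solve the resulting Newton systems by preconditioned conjugate gradients, which is where the proposed preconditioner enters.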