Complexity of quadratic penalty methods with adaptive accuracy under a PL condition for the constraints

We study the quadratic penalty method (QPM) for smooth nonconvex optimization problems with equality constraints. Assuming the constraint violation satisfies the PL condition near the feasible set, we derive sharper worst-case complexity bounds for obtaining approximate first-order KKT points. When the objective and constraints are twice continuously differentiable, we show that QPM equipped with a …
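
As a rough illustration of the basic mechanism only (the penalty update, inner solver, and tolerance rule below are assumptions for this sketch, not the paper's scheme), a quadratic penalty loop whose inner accuracy tightens as the penalty parameter grows can be written as:

```python
import numpy as np
from scipy.optimize import minimize

# Hedged sketch of a quadratic penalty method with adaptive inner accuracy.
# Problem: min f(x) s.t. c(x) = 0, handled via phi_rho(x) = f(x) + (rho/2)*||c(x)||^2.
def quadratic_penalty(f, grad_f, c, jac_c, x0,
                      rho0=1.0, rho_factor=10.0, tol=1e-6, max_outer=20):
    x, rho = np.asarray(x0, dtype=float), rho0
    for _ in range(max_outer):
        phi = lambda z: f(z) + 0.5 * rho * np.sum(c(z) ** 2)
        grad_phi = lambda z: grad_f(z) + rho * jac_c(z).T @ c(z)
        inner_tol = max(tol, 1.0 / rho)  # inner accuracy tightens as rho grows
        x = minimize(phi, x, jac=grad_phi, method="L-BFGS-B",
                     options={"gtol": inner_tol}).x
        if np.linalg.norm(c(x)) <= tol and np.linalg.norm(grad_phi(x)) <= tol:
            return x  # approximate first-order KKT point
        rho *= rho_factor
    return x
```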

A Fast Newton Method Under Local Lipschitz Smoothness

A new, fast second-order method is proposed that achieves the optimal \(\mathcal{O}\left(|\log(\epsilon)|\,\epsilon^{-3/2}\right)\) complexity to obtain first-order \(\epsilon\)-stationary points. Crucially, this is deduced without assuming the standard global Lipschitz Hessian continuity condition, but only using an appropriate local smoothness requirement. The algorithm exploits Hessian information to compute a Newton step and a negative curvature step when …
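
For intuition only, a Newton-or-negative-curvature step selection could be sketched as below; the eigendecomposition-based rule and the threshold eps are illustrative assumptions rather than the proposed algorithm.

```python
import numpy as np

# Hedged sketch: take the Newton direction when the Hessian is sufficiently positive
# definite, otherwise move along a direction of most negative curvature.
def second_order_step(grad, hess, eps=1e-6):
    eigvals, eigvecs = np.linalg.eigh(hess)
    if eigvals[0] >= eps:                      # Hessian sufficiently positive definite
        return -np.linalg.solve(hess, grad)    # Newton step
    d = eigvecs[:, 0]                          # eigenvector of the smallest eigenvalue
    s = d @ grad
    return -np.sign(s) * d if s != 0 else d   # descent along negative curvature
```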

Block cubic Newton with greedy selection

A second-order block coordinate descent method is proposed for the unconstrained minimization of an objective function with a Lipschitz continuous Hessian. At each iteration, a block of variables is selected by means of a greedy (Gauss-Southwell) rule which considers the amount of first-order stationarity violation, then an approximate minimizer of a cubic model is computed …
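
A minimal sketch of the greedy (Gauss-Southwell) block selection combined with a cubic-regularized block step is given below; the block structure, the constant M, and the off-the-shelf inner solver are assumptions of the sketch, not the paper's method.

```python
import numpy as np
from scipy.optimize import minimize

# Hedged sketch: pick the block with the largest gradient norm (Gauss-Southwell rule),
# then approximately minimize a cubic model restricted to that block.
def block_cubic_step(x, grad_f, hess_f, blocks, M=1.0):
    g, H = grad_f(x), hess_f(x)
    b = max(blocks, key=lambda idx: np.linalg.norm(g[idx]))   # greedy block choice
    gb, Hb = g[b], H[np.ix_(b, b)]
    # Cubic model m(s) = gb^T s + 0.5 s^T Hb s + (M/6) ||s||^3 on the chosen block.
    model = lambda s: gb @ s + 0.5 * s @ Hb @ s + (M / 6.0) * np.linalg.norm(s) ** 3
    s = minimize(model, np.zeros(len(b))).x                   # approximate minimizer
    x_new = x.copy()
    x_new[b] += s
    return x_new
```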

Riemannian trust-region methods for strict saddle functions with complexity guarantees

The difficulty of minimizing a nonconvex function is in part explained by the presence of saddle points. This slows down optimization algorithms and impacts worst-case complexity guarantees. However, many nonconvex problems of interest possess a favorable structure for optimization, in the sense that saddle points can be escaped efficiently by appropriate algorithms. This strict saddle …

Nonlinear matrix recovery using optimization on the Grassmann manifold

We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure, such as lying in a union of subspaces, on an algebraic variety, or being grouped into clusters. The recovery problem is formulated as the rank minimization of a nonlinear feature map applied to the original matrix, which is then further approximated by …
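
One hedged way to write such a formulation (the symbols \(\Phi\), \(X\), and \(\Omega\) are notation chosen here for illustration, not necessarily the paper's) is

\[
\min_{M \in \mathbb{R}^{m \times n}} \ \operatorname{rank}\big(\Phi(M)\big)
\quad \text{subject to} \quad M_{ij} = X_{ij} \ \text{ for all } (i,j) \in \Omega,
\]

where \(X\) is the partially observed matrix, \(\Omega\) the set of observed entries, and \(\Phi\) a nonlinear feature map (for instance, monomial features when the columns lie on an algebraic variety).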

Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization

We establish or refute the optimality of inexact second-order methods for unconstrained nonconvex optimization from the point of view of worst-case evaluation complexity, improving and generalizing the results of Cartis, Gould and Toint (2010, 2011). To this aim, we consider a new general class of inexact second-order algorithms for unconstrained optimization that includes regularization and trust-region …

Oracle Complexity of Second-Order Methods for Smooth Convex Optimization

Second-order methods, which utilize gradients as well as Hessians to optimize a given function, are of major importance in mathematical optimization. In this work, we study the oracle complexity of such methods, or equivalently, the number of iterations required to optimize a function to a given accuracy. Focusing on smooth and convex functions, we derive …

Optimization Methods for Large-Scale Machine Learning

This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of …

Iteration-complexity of Rockafellar’s proximal method of multipliers for convex programming based on second-order approximations

This paper studies the iteration-complexity of a new primal-dual algorithm based on Rockafellar’s proximal method of multipliers (PMM) for solving smooth convex programming problems with inequality constraints. At each step, either a PMM step for a second-order model of the problem is computed or a relaxed extragradient step is performed. The resulting algorithm …
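
To fix ideas, the classical PMM iteration underlying the algorithm can be sketched as below for inequality-constrained problems; the paper's second-order models and relaxed extragradient steps are not reproduced, and the inner solver is an assumption of the sketch.

```python
import numpy as np
from scipy.optimize import minimize

# Hedged sketch of Rockafellar's proximal method of multipliers for
#   min f(x)  s.t.  g(x) <= 0.
def pmm(f, g, x0, lam0, c=1.0, iters=50):
    x, lam = np.asarray(x0, float), np.asarray(lam0, float)
    for _ in range(iters):
        def prox_aug_lag(z):
            # augmented Lagrangian for inequality constraints + primal proximal term
            shifted = np.maximum(0.0, lam + c * g(z))
            return (f(z)
                    + (np.sum(shifted ** 2) - np.sum(lam ** 2)) / (2.0 * c)
                    + np.sum((z - x) ** 2) / (2.0 * c))
        x = minimize(prox_aug_lag, x).x                 # proximal primal step
        lam = np.maximum(0.0, lam + c * g(x))           # multiplier update
    return x, lam
```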

Newton-like method with diagonal correction for distributed optimization

We consider distributed optimization problems in which networked nodes cooperatively minimize the sum of their locally known convex costs. A popular class of methods for these problems are distributed gradient methods, which are attractive due to their inexpensive iterations but suffer from slow convergence rates. This motivates the incorporation of second-order information …
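
For context, a plain distributed-gradient (DGD-style) iteration of the kind referred to above can be sketched as follows; the doubly stochastic weight matrix W and the step size are assumptions, and the paper's Newton-like diagonal correction is not shown.

```python
import numpy as np

# Hedged sketch of one distributed-gradient step: each node averages its neighbors'
# iterates through a doubly stochastic weight matrix W, then takes a local gradient step.
def dgd_step(X, local_grads, W, alpha):
    # X: (n_nodes, dim) matrix of current iterates; local_grads[i] is node i's gradient.
    G = np.stack([local_grads[i](X[i]) for i in range(X.shape[0])])
    return W @ X - alpha * G
```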