Fast Stochastic Second-Order Adagrad for Nonconvex Bound-Constrained Optimization

ADAGB2, a generalization of the Adagrad algorithm for stochastic optimization is introduced, which is also applicable to bound-constrained problems and capable of using second-order information when available. It is shown that, givenĀ  delta in (0,1) and epsilon in (0,1], the ADAGB2 algorithm needs at most O(epsilon^{-2}) iterations to ensure an epsilon-approximate first-order critical point of … Read more

A Fast Newton Method Under Local Lipschitz Smoothness

A new, fast second-order method is proposed that achieves the optimal \(\mathcal{O}\left(|\log(\epsilon)|\epsilon^{-3/2}\right) \) complexity to obtain first-order $\epsilon$-stationary points. Crucially, this is deduced without assuming the standard global Lipschitz Hessian continuity condition, but onlyusing an appropriate local smoothness requirement. The algorithm exploits Hessian information to compute a Newton step and a negative curvature step when … Read more

Refining asymptotic complexity bounds for nonconvex optimization methods, including why steepest descent is o(eps^{-2}) rather than O(eps^{-2})

We revisit the standard “telescoping sum” argument ubiquitous in the final steps of analyzing evaluation complexity of algorithms for smooth nonconvex optimization, and obtain a refined formulation of the resulting bound as a function of the requested accuracy eps. While bounds obtained using the standard argument typically are of the form \(O(\epsilon^{-\alpha})\) for some positive … Read more

A Stochastic Objective-Function-Free Adaptive Regularization Method with Optimal Complexity

A fully stochastic second-order adaptive-regularization method for unconstrained nonconvex optimization is presented which never computes the objective-function value, but yet achieves the optimal $\mathcal{O}(\epsilon^{-3/2})$ complexity bound for finding first-order critical points. The method is noise-tolerant and the inexactness conditions required for convergence depend on the history of past steps. Applications to cases where derivative evaluation … Read more

S2MPJ and CUTEst optimization problems for Matlab, Python and Julia

A new decoder for the SIF test problems of the \cutest\ collection is described, which produces problem files allowing the computation of values and derivatives of the objective function and constraints of most \cutest\ problems directly within “native” Matlab, Python or Julia, without any additional installation or interfacing with MEX files or Fortran programs. When … Read more

Complexity of Adagrad and other first-order methods for nonconvex optimization problems with bounds constraints

A parametric class of trust-region algorithms for constrained nonconvex optimization is analyzed, where the objective function is never computed. By defining appropriate first-order stationarity criteria, we are able to extend the Adagrad method to the newly considered problem and retrieve the standard complexity rate of the projected gradient method that uses both the gradient and … Read more

Yet another fast variant of Newton’s method for nonconvex optimization

A second-order algorithm is proposed for minimizing smooth nonconvex functions that alternates between regularized Newton and negative curvature steps. In most cases, the Hessian matrix is regularized with the square root of the current gradient and an additional term taking moderate negative curvature into account, a negative curvature step being taken only exceptionnally. As a … Read more

Multilevel Objective-Function-Free Optimization with an Application to Neural Networks Training

A class of multi-level algorithms for unconstrained nonlinear optimization is presented which does not require the evaluation of the objective function. The class contains the momentum-less AdaGrad method as a particular (single-level) instance. The choice of avoiding the evaluation of the objective function is intended to make the algorithms of the class less sensitive to … Read more

Convergence properties of an Objective-Function-Free Optimization regularization algorithm, including an $\mathcal{O}(\epsilon^{-3/2})$ complexity bound

An adaptive regularization algorithm for unconstrained nonconvex optimization is presented in which the objective function is never evaluated, but only derivatives are used. This algorithm belongs to the class of adaptive regularization methods, for which optimal worst-case complexity results are known for the standard framework where the objective function is evaluated. It is shown in … Read more