A Stochastic Objective-Function-Free Adaptive Regularization Method with Optimal Complexity

\(\) A fully stochastic second-order adaptive-regularization method for unconstrained nonconvex optimization is presented which never computes the objective-function value, but yet achieves the optimal $\mathcal{O}(\epsilon^{-3/2})$ complexity bound for finding first-order critical points. The method is noise-tolerant and the inexactness conditions required for convergence depend on the history of past steps. Applications to cases where derivative … Read more

Complexity of Adagrad and other first-order methods for nonconvex optimization problems with bounds constraints

A parametric class of trust-region algorithms for constrained nonconvex optimization is analyzed, where the objective function is never computed. By defining appropriate first-order stationarity criteria, we are able to extend the Adagrad method to the newly considered problem and retrieve the standard complexity rate of the projected gradient method that uses both the gradient and … Read more

Yet another fast variant of Newton’s method for nonconvex optimization

\(\) A second-order algorithm is proposed for minimizing smooth nonconvex functions that alternates between regularized Newton and negative curvature steps. In most cases, the Hessian matrix is regularized with the square root of the current gradient and an additional term taking moderate negative curvature into account, a negative curvature step being taken only exceptionnally. As … Read more

Convergence properties of an Objective-Function-Free Optimization regularization algorithm, including an $\mathcal{O}(\epsilon^{-3/2})$ complexity bound

An adaptive regularization algorithm for unconstrained nonconvex optimization is presented in which the objective function is never evaluated, but only derivatives are used. This algorithm belongs to the class of adaptive regularization methods, for which optimal worst-case complexity results are known for the standard framework where the objective function is evaluated. It is shown in … Read more

Parametric complexity analysis for a class of first-order Adagrad-like algorithms

A class of algorithms for optimization in the presence of noise is presented, that does not require the evaluation of the objective function. This class generalizes the well-known Adagrad method. The complexity of this class is then analyzed as a function of its parameters, and it is shown that some methods of the class enjoy … Read more

First-Order Objective-Function-Free Optimization Algorithms and Their Complexity

A class of algorithms for unconstrained nonconvex optimization is considered where the value of the objective function is never computed. The class contains a deterministic version of the first-order Adagrad method typically used for minimization of noisy function, but also allows the use of second-order information when available. The rate of convergence of methods in … Read more

Hölder Gradient Descent and Adaptive Regularization Methods in Banach Spaces for First-Order Points

This paper considers optimization of smooth nonconvex functionals in smooth infinite dimensional spaces. A Hölder gradient descent algorithm is first proposed for finding approximate first-order points of regularized polynomial functionals. This method is then applied to analyze the evaluation complexity of an adaptive regularization method which searches for approximate first-order points of functionals with $\beta$-H\”older … Read more