A Robust Multi-Batch L-BFGS Method for Machine Learning

This paper describes an implementation of the L-BFGS method designed to deal with two adversarial situations. The first occurs in distributed computing environments where some of the computational nodes devoted to the evaluation of the function and gradient are unable to return results on time. A similar challenge occurs in a multi-batch approach in which … Read more

Behavior of accelerated gradient methods near critical points of nonconvex functions

We examine the behavior of accelerated gradient methods in smooth nonconvex unconstrained optimization, focusing in particular on their behavior near strict saddle points. Accelerated methods are iterative methods that typically step along a direction that is a linear combination of the previous step and the gradient of the function evaluated at a point at or … Read more

Properties of the block BFGS update and its application to the limited-memory block BNS method for unconstrained minimization.

A block version of the BFGS variable metric update formula and its modifications are investigated. In spite of the fact that this formula satisfies the quasi-Newton conditions with all used difference vectors and that the improvement of convergence is the best one in some sense for quadratic objective functions, for general functions it does not … Read more

A Line-Search Algorithm Inspired by the Adaptive Cubic Regularization Framework and Complexity Analysis

Adaptive regularized framework using cubics has emerged as an alternative to line-search and trust-region algorithms for smooth nonconvex optimization, with an optimal complexity amongst second-order methods. In this paper, we propose and analyze the use of an iteration dependent scaled norm in the adaptive regularized framework using cubics. Within such scaled norm, the obtained method … Read more

Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization

There has been much recent interest in finding unconstrained local minima of smooth functions, due in part of the prevalence of such problems in machine learning and robust statistics. A particular focus is algorithms with good complexity guarantees. Second-order Newton-type methods that make use of regularization and trust regions have been analyzed from such a … Read more

Analyzing Random Permutations for Cyclic Coordinate Descent

We consider coordinate descent methods on convex quadratic problems, in which exact line searches are performed at each iteration. (This algorithm is identical to Gauss-Seidel on the equivalent symmetric positive definite linear system.) We describe a class of convex quadratic problems for which the random-permutations version of cyclic coordinate descent (RPCD) outperforms the standard cyclic … Read more

Convergence and Complexity Analysis of a Levenberg-Marquardt Algorithm for Inverse Problems

The Levenberg-Marquardt algorithm is one of the most popular algorithms for finding the solution of nonlinear least squares problems. Across different modified variations of the basic procedure, the algorithm enjoys global convergence, a competitive worst case iteration complexity rate, and a guaranteed rate of local convergence for both zero and nonzero small residual problems, under … Read more

On the use of the energy norm in trust-region and adaptive cubic regularization subproblems

We consider solving unconstrained optimization problems by means of two popular globalization techniques: trust-region (TR) algorithms and adaptive regularized framework using cubics (ARC). Both techniques require the solution of a so-called “subproblem” in which a trial step is computed by solving an optimization problem involving an approximation of the objective function, called “the model”. The … Read more

A decoupled first/second-order steps technique for nonconvex nonlinear unconstrained optimization with improved complexity bounds

In order to be provably convergent towards a second-order stationary point, optimization methods applied to nonconvex problems must necessarily exploit both first and second-order information. However, as revealed by recent complexity analyzes of some of these methods, the overall effort to reach second-order points is significantly larger when compared to the one of approaching first-order … Read more

Automatic Differentiation of the Open CASCADE Technology CAD System and its coupling with an Adjoint CFD Solver

Automatic Differentiation (AD) is applied to the open-source CAD system Open CASCADE Technology using the AD software tool ADOL-C (Automatic Differentiation by OverLoading in C++). The differentiated CAD system is coupled with a discrete adjoint CFD solver, thus providing the first example of a complete differentiated design chain built from generic, multi-purpose tools. The design … Read more