Compact Representations of Structured BFGS Matrices

For general large-scale optimization problems compact representations exist in which recursive quasi-Newton update formulas are represented as compact matrix factorizations. For problems in which the objective function contains additional structure, so-called structured quasi-Newton methods exploit available second-derivative information and approximate unavailable second derivatives. This article develops the compact representations of two structured Broyden-Fletcher-Goldfarb-Shanno update formulas. … Read more

A Distributed Quasi-Newton Algorithm for Primal and Dual Regularized Empirical Risk Minimization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving empirical risk minimization (ERM) problems with a nonsmooth regularization term. Our algorithm is applicable to both the primal and the dual ERM problem. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or … Read more

Limited-Memory BFGS with Displacement Aggregation

A displacement aggregation strategy is proposed for the curvature pairs stored in a limited-memory BFGS (a.k.a. L-BFGS) method such that the resulting (inverse) Hessian approximations are equal to those that would be derived from a full-memory BFGS method. This means that, if a sufficiently large number of pairs are stored, then an optimization algorithm employing … Read more

Quasi-Newton Methods for Deep Learning: Forget the Past, Just Sample

We present two sampled quasi-Newton methods: sampled LBFGS and sampled LSR1. Contrary to the classical variants of these methods that sequentially build (inverse) Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate to produce these approximations. As a result, the approximations constructed make use of more reliable (recent … Read more

On limited-memory quasi-Newton methods for minimizing a quadratic function

The main focus in this paper is exact linesearch methods for minimizing a quadratic function whose Hessian is positive definite. We give two classes of limited-memory quasi-Newton Hessian approximations that generate search directions parallel to those of the method of preconditioned conjugate gradients, and hence give finite termination on quadratic optimization problems. The Hessian approximations … Read more

Quasi-Newton approaches to Interior Point Methods for quadratic problems

Interior Point Methods (IPM) rely on the Newton method for solving systems of nonlinear equations. Solving the linear systems which arise from this approach is the most computationally expensive task of an interior point iteration. If, due to problem’s inner structure, there are special techniques for efficiently solving linear systems, IPMs enjoy fast convergence and … Read more

A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving ERM problems with a nonsmooth regularization term. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or work only for specific regularizers. Our algorithm uses successive quadratic approximations, and we describe how to … Read more

On Quasi-Newton Forward–Backward Splitting: Proximal Calculus and Convergence

We introduce a framework for quasi-Newton forward–backward splitting algorithms (proximal quasi-Newton methods) with a metric induced by diagonal +/- rank-r symmetric positive definite matrices. This special type of metric allows for a highly efficient evaluation of the proximal mapping. The key to this efficiency is a general proximal calculus in the new metric. By using … Read more

Multipoint secant and interpolation methods with nonmonotone line search for solving systems of nonlinear equations

Multipoint secant and interpolation methods are effective tools for solving systems of nonlinear equations. They use quasi-Newton updates for approximating the Jacobian matrix. Owing to their ability to more completely utilize the information about the Jacobian matrix gathered at the previous iterations, these methods are especially efficient in the case of expensive functions. They are … Read more

Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations

Machine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods for solving ML problems based on stochastic gradient descent are easily scaled for very large problems but may involve fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manually tuning hyper-parameters but … Read more