Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations

Machine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods for solving ML problems based on stochastic gradient descent are easily scaled for very large problems but may involve fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manually tuning hyper-parameters but … Read more

A Self-Correcting Variable-Metric Algorithm Framework for Nonsmooth Optimization

An algorithm framework is proposed for minimizing nonsmooth functions. The framework is variable-metric in that, in each iteration, a step is computed using a symmetric positive definite matrix whose value is updated as in a quasi-Newton scheme. However, unlike previously proposed variable-metric algorithms for minimizing nonsmooth functions, the framework exploits self-correcting properties made possible through … Read more

Block BFGS Methods

We introduce a quasi-Newton method with block updates called Block BFGS. We show that this method, performed with inexact Armijo-Wolfe line searches, converges globally and superlinearly under the same convexity assumptions as BFGS. We also show that Block BFGS is globally convergent to a stationary point when applied to non-convex functions with bounded Hessian, and … Read more

Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization

In this paper we study stochastic quasi-Newton methods for nonconvex stochastic optimization, where we assume that noisy information about the gradients of the objective function is available via a stochastic first-order oracle ($\SFO$). We propose a general framework for such methods, for which we prove almost sure convergence to stationary points and analyze its worst-case … Read more

Improved Damped Quasi-Newton Methods for Unconstrained Optimization

Recently, Al-Baali (2014) has extended the damped-technique in the modified BFGS method of Powell (1978) for Lagrange constrained optimization functions to the Broyden family of quasi-Newton methods for unconstrained optimization. Appropriate choices for the damped-parameter, which maintain the global and superlinear con- vergence property of these methods on convex functions and correct the Hessian approximations … Read more

Generalized Uniformly Optimal Methods for Nonlinear Programming

In this paper, we present a generic framework to extend existing uniformly optimal convex programming algorithms to solve more general nonlinear, possibly nonconvex, optimization problems. The basic idea is to incorporate a local search step (gradient descent or Quasi-Newton iteration) into these uniformly optimal convex programming methods, and then enforce a monotone decreasing property of … Read more

On the equivalence of the method of conjugate gradients and quasi-Newton methods on quadratic problems

In this paper we state necessary and sufficient conditions for equivalence of the method of conjugate gradients and quasi-Newton methods on a quadratic problem. We show that the set of quasi-Newton schemes that generate parallel search directions to those of the method of conjugate gradients is strictly larger than the one-parameter Broyden family. In addition, … Read more

Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization

In this paper we study stochastic quasi-Newton methods for nonconvex stochastic optimization, where we assume that only stochastic information of the gradients of the objective function is available via a stochastic first-order oracle (SFO). Firstly, we propose a general framework of stochastic quasi-Newton methods for solving nonconvex stochastic optimization. The proposed framework extends the classic … Read more

A Quasi-Newton Algorithm for Nonconvex, Nonsmooth Optimization with Global Convergence Guarantees

A line search algorithm for minimizing nonconvex and/or nonsmooth objective functions is presented. The algorithm is a hybrid between a standard Broyden–Fletcher–Goldfarb–Shanno (BFGS) and an adaptive gradient sampling (GS) method. The BFGS strategy is employed because it typically yields fast convergence to the vicinity of a stationary point, and together with the adaptive GS strategy … Read more

A Stochastic Quasi-Newton Method for Large-Scale Optimization

Abstract The question of how to incorporate curvature information in stochastic approximation methods is challenging. The direct application of classical quasi- Newton updating techniques for deterministic optimization leads to noisy curvature estimates that have harmful effects on the robustness of the iteration. In this paper, we propose a stochastic quasi-Newton method that is efficient, robust … Read more