Stabilized Barzilai-Borwein method

The Barzilai-Borwein (BB) method is a popular and efficient tool for solving large-scale unconstrained optimization problems. Its search direction is the same as that of the steepest descent (Cauchy) method, but its stepsize rule is different. Owing to this, it converges much faster than the Cauchy method. A feature of the BB method is that it … Read more
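
For reference, the two classical BB stepsizes for the iteration $x_{k+1} = x_k - \alpha_k g_k$ (with $g_k = \nabla f(x_k)$) are commonly written as follows; this is the standard textbook formulation, not necessarily the exact variant stabilized in the paper:
\[
s_{k-1} = x_k - x_{k-1}, \quad y_{k-1} = g_k - g_{k-1}, \quad
\alpha_k^{\mathrm{BB1}} = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}, \quad
\alpha_k^{\mathrm{BB2}} = \frac{s_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}}.
\]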

Adaptive cubic regularization methods with dynamic inexact Hessian information and applications to finite-sum minimization

Abstract. We consider the Adaptive Regularization with Cubics (ARC) approach for solving nonconvex optimization problems and propose a new variant based on inexact Hessian information chosen dynamically. A theoretical analysis of the proposed procedure is given. The key property of the ARC framework, namely its optimal worst-case function/derivative evaluation bounds for reaching first- and second-order critical points, is … Read more
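
In a generic ARC iteration (a standard sketch; the inexactness of the Hessian approximation $B_k$ is the subject of this paper), the trial step minimizes, possibly approximately, the cubic model
\[
m_k(s) = f(x_k) + \nabla f(x_k)^T s + \tfrac{1}{2}\, s^T B_k s + \tfrac{\sigma_k}{3} \|s\|^3,
\]
where $\sigma_k > 0$ is the adaptive regularization weight.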

Numerical solution of generalized minimax problems

This contribution contains the description and investigation of four numerical methods for solving generalized minimax problems, which consist in the minimization of functions that are compositions of special smooth convex functions with maxima of smooth functions (the most important problem of this type is the minimization of a sum of maxima of smooth functions). Section 1 is introductory. In … Read more
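
As an illustration (a generic formulation in assumed notation, not the paper's own), the sum-of-maxima case mentioned above reads
\[
F(x) = \sum_{i=1}^{m} \max_{1 \le j \le p_i} f_{ij}(x),
\]
where each $f_{ij}$ is smooth and the outer sum plays the role of the special smooth convex function applied to the inner maxima.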

Hybrid methods for nonlinear least squares problems

This contribution contains a description and analysis of effective methods for minimization of the nonlinear least-squares function $F(x) = \frac{1}{2} f(x)^T f(x)$, where $x \in R^n$ and $f: R^n \to R^m$, together with extensive computational tests and comparisons of the introduced methods. All hybrid methods are described in detail and their global convergence is proved … Read more
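
For orientation (standard least-squares identities, not details taken from the paper), with $J(x)$ the Jacobian of $f$, the gradient and the Gauss-Newton Hessian approximation are
\[
\nabla F(x) = J(x)^T f(x), \qquad
\nabla^2 F(x) = J(x)^T J(x) + \sum_{i=1}^{m} f_i(x)\,\nabla^2 f_i(x) \;\approx\; J(x)^T J(x),
\]
and hybrid methods typically switch between a Gauss-Newton-type step, which works well on small-residual problems, and a quasi-Newton or gradient step when the neglected second-order term is significant.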

A Theoretical and Empirical Comparison of Gradient Approximations in Derivative-Free Optimization

In this paper, we analyze several methods for approximating gradients of noisy functions using only function values. These methods include finite differences, linear interpolation, Gaussian smoothing and smoothing on a unit sphere. The methods differ in the number of function samples used, the choice of the sample points, and the way in which the gradient approximations … Read more
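
Two of the estimators mentioned can be sketched as follows (standard forms, with finite-difference interval $h$, smoothing radius $\sigma$, and sample count $N$ as generic parameters):
\[
\big[\hat{g}_{\mathrm{FD}}(x)\big]_i = \frac{f(x + h e_i) - f(x)}{h}, \qquad
\hat{g}_{\mathrm{GS}}(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{f(x + \sigma u_i) - f(x)}{\sigma}\, u_i, \quad u_i \sim \mathcal{N}(0, I).
\]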

A Class of Stochastic Variance Reduced Methods with an Adaptive Stepsize

Stochastic variance reduced methods have recently surged into prominence for solving large-scale optimization problems in the context of machine learning. Tan et al. first combined the stochastic variance reduced gradient (SVRG) method with the Barzilai-Borwein (BB) method to compute step sizes automatically, which performs well in practice. On this basis, … Read more
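
As a rough sketch of the SVRG-BB idea (in generic notation assumed here, not quoted from the paper): in outer epoch $k$ with snapshot $\tilde{x}_k$, the inner updates use the variance-reduced gradient, and the stepsize is a BB-type quantity computed from successive snapshots,
\[
v_t = \nabla f_{i_t}(x_t) - \nabla f_{i_t}(\tilde{x}_k) + \nabla F(\tilde{x}_k), \qquad
\eta_k = \frac{1}{m}\,\frac{\|\tilde{x}_k - \tilde{x}_{k-1}\|^2}{(\tilde{x}_k - \tilde{x}_{k-1})^T\big(\nabla F(\tilde{x}_k) - \nabla F(\tilde{x}_{k-1})\big)},
\]
where $m$ is the inner loop length and $F$ is the full finite-sum objective.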

On monotonic estimates of the norm of the minimizers of regularized quadratic functions in Krylov spaces

We show that the minimizers of regularized quadratic functions restricted to their natural Krylov spaces increase in Euclidean norm as the spaces expand. Citation: Technical Report RAL-TR-2019-005, STFC-Rutherford Appleton Laboratory, Oxfordshire, England, April 5th, 2019.
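
Concretely (assuming a typical instance of such a regularized quadratic, in notation not taken from the report), the result concerns problems of the form
\[
\min_{x \in \mathcal{K}_j} \; \tfrac{1}{2} x^T H x + c^T x + \tfrac{\sigma}{p}\|x\|^p, \qquad
\mathcal{K}_j = \mathrm{span}\{c, Hc, \dots, H^{j-1}c\},
\]
and states that the Euclidean norm of the minimizer is nondecreasing as $j$ grows.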

A Delayed Weighted Gradient Method for Strictly Convex Quadratic Minimization

This paper develops an accelerated version of the steepest descent method via a two-step iteration. The new algorithm uses delayed information to define the iterations. Specifically, in the first step, a prediction of the new test point is calculated by using the gradient method with the exact minimal-gradient steplength, and then a correction … Read more
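
For a strictly convex quadratic $q(x) = \tfrac{1}{2} x^T A x - b^T x$ with gradient $g_k = A x_k - b$, the exact minimal-gradient steplength mentioned above is the $\alpha$ that minimizes the norm of the next gradient (a standard computation, independent of the paper's correction step):
\[
\alpha_k = \arg\min_{\alpha} \|g_k - \alpha A g_k\| = \frac{g_k^T A g_k}{\|A g_k\|^2}.
\]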

Planning for Dynamics under Uncertainty

Planning under uncertainty is a frequently encountered problem. Noisy observations are a typical source of such uncertainty. The problem can be formulated as a Partially Observable Markov Decision Process (POMDP). However, solving a POMDP is nontrivial and can be computationally expensive in continuous state, action, observation, and latent-state spaces. Through this work, we … Read more
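
For reference (the standard POMDP belief update, not a detail specific to this work), an agent maintains a belief $b$ over hidden states and, after taking action $a$ and receiving observation $o$, updates it via
\[
b'(s') \;\propto\; Z(o \mid s', a) \sum_{s} T(s' \mid s, a)\, b(s),
\]
where $T$ is the transition model and $Z$ the observation model; planning then operates over this belief space rather than the hidden state.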

Limited-Memory BFGS with Displacement Aggregation

A displacement aggregation strategy is proposed for the curvature pairs stored in a limited-memory BFGS (a.k.a. L-BFGS) method such that the resulting (inverse) Hessian approximations are equal to those that would be derived from a full-memory BFGS method. This means that, if a sufficiently large number of pairs are stored, then an optimization algorithm employing … Read more
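
For context (the standard BFGS building block on which such limited-memory schemes operate, not the aggregation strategy itself), each stored curvature pair $(s_k, y_k) = (x_{k+1} - x_k,\; \nabla f(x_{k+1}) - \nabla f(x_k))$ enters the inverse-Hessian update
\[
H_{k+1} = \big(I - \rho_k s_k y_k^T\big) H_k \big(I - \rho_k y_k s_k^T\big) + \rho_k s_k s_k^T, \qquad \rho_k = \frac{1}{y_k^T s_k},
\]
and L-BFGS applies only the most recent $m$ such updates to an initial matrix; displacement aggregation adjusts the stored displacement pairs so that this limited history can reproduce the full-memory approximation.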