On the steplength selection in gradient methods for unconstrained optimization

The seminal paper by Barzilai and Borwein [IMA J. Numer. Anal. 8 (1988)] has given rise to an extensive investigation aimed at developing effective gradient methods capable of dealing with large-scale optimization problems. Several steplength rules were first designed for unconstrained quadratic problems and then extended to general nonlinear problems; these rules share the common idea of attempting to capture, in an inexpensive way, some second-order information. Our aim is to investigate the relationship between the steplengths of some gradient methods and the spectrum of the Hessian of the objective function, in order to provide insight into the computational effectiveness of these methods. We start the analysis in the framework of strongly convex quadratic problems, where the role of the eigenvalues of the Hessian matrix in the behaviour of gradient methods is better understood. Then we move to general unconstrained problems, focusing on natural extensions of some steplength rules analysed in the quadratic case. Our study suggests that, in the quadratic case, the methods that tend to use groups of small steplengths followed by some large ones, attempting to approximate the inverses of some eigenvalues of the Hessian matrix, exhibit better numerical behaviour. The methods considered in the general case seem to preserve the behaviour of their quadratic counterparts, in the sense that they appear to somehow follow the spectrum of the Hessian of the objective function during their progress toward a stationary point.
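To make the spectral connection concrete, the sketch below implements a gradient method with the classical BB1 steplength on a strongly convex quadratic. This is our illustrative example, not code or an experiment from the paper; the test problem, function names, and parameter values are all our own choices. On a quadratic f(x) = 0.5 xᵀAx − bᵀx one has yₖ = A sₖ, so the BB1 step sₖᵀsₖ / sₖᵀyₖ is the inverse of a Rayleigh quotient of the Hessian A and therefore always lies between the inverses of its extreme eigenvalues, which is the kind of spectral behaviour the abstract discusses.

```python
import numpy as np

def bb1_gradient(A, b, x0, tol=1e-8, max_iter=500):
    """Gradient method with the BB1 steplength on f(x) = 0.5*x^T A x - b^T x."""
    x = x0.copy()
    g = A @ x - b                       # gradient of the quadratic
    alpha = 1.0 / np.linalg.norm(g)     # simple choice for the first steplength
    for k in range(max_iter):
        x_new = x - alpha * g           # gradient step
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g     # s_k = x_{k+1}-x_k, y_k = g_{k+1}-g_k = A s_k
        # BB1 rule: inverse Rayleigh quotient of A,
        # hence 1/lambda_max <= alpha <= 1/lambda_min
        alpha = (s @ s) / (s @ y)
        x, g = x_new, g_new
        if np.linalg.norm(g) <= tol:
            break
    return x, k + 1

# Usage on a 100-dimensional quadratic with eigenvalues spread over [1, 1e3]
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((100, 100)))
A = Q @ np.diag(np.linspace(1.0, 1e3, 100)) @ Q.T   # SPD Hessian with known spectrum
b = rng.standard_normal(100)
x, iters = bb1_gradient(A, b, np.zeros(100))
print(iters, np.linalg.norm(A @ x - b))             # iterations and final gradient norm
```

Tracking the sequence of alpha values produced by this sketch against the known eigenvalues of A is one simple way to observe the "groups of small steplengths followed by some large ones" pattern mentioned above; the BB2 variant, sₖᵀyₖ / yₖᵀyₖ, can be obtained by changing a single line.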
