Some new accelerated and stochastic gradient descent algorithms based on locally Lipschitz gradient constants

In this paper, we revisit the recently proposed stepsize for the gradient descent scheme, called NGD [Hoai et al., A novel stepsize for gradient descent method, Operations Research Letters (2024) 53, doi: 10.1016/j.orl.2024.107072]. We first combine the NGD stepsize with two well-known acceleration techniques, the Heavy-ball and Nesterov methods. For convex unconstrained nonlinear optimization problems, we establish ergodic convergence of the iterates generated by the accelerated versions of NGD at a sublinear rate. We also introduce stochastic versions of the proposed accelerated algorithms and analyze their convergence in the nonconvex setting. Although our algorithms require global Lipschitz continuity of the gradient, we do not use the global Lipschitz constant in the computations; instead, we exploit information about local Lipschitz constants obtained from previous iterations. Numerical experiments on practical machine learning and deep learning problems demonstrate the efficiency of the proposed methods compared with existing ones.
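The abstract does not spell out the NGD stepsize formula or the accelerated update rules. The following is a minimal illustrative sketch, assuming the local Lipschitz constant is estimated from consecutive iterates as ||∇f(x_k) − ∇f(x_{k−1})|| / ||x_k − x_{k−1}|| and combined with a Heavy-ball momentum term; the function name, its parameters, and the test problem are hypothetical and do not reproduce the authors' exact algorithm.

# Sketch: Heavy-ball gradient descent whose stepsize is built from a local
# Lipschitz estimate of the gradient, in the spirit described in the abstract.
# The estimate L_k = ||grad(x_k) - grad(x_{k-1})|| / ||x_k - x_{k-1}|| and the
# constants beta, L0 are illustrative assumptions, not the NGD rule itself.
import numpy as np

def heavy_ball_local_lipschitz(grad, x0, iters=200, beta=0.5, L0=1.0, tol=1e-10):
    """Minimize a smooth function given its gradient `grad` (hypothetical helper).

    grad : callable returning the gradient at a point
    x0   : starting point (numpy array)
    beta : Heavy-ball momentum coefficient (assumed value)
    L0   : initial guess for the local Lipschitz constant
    """
    x_prev = x0.copy()
    g_prev = grad(x_prev)
    # First step: plain gradient step using the initial Lipschitz guess.
    x = x_prev - (1.0 / L0) * g_prev
    for _ in range(iters):
        g = grad(x)
        dx = x - x_prev
        dg = g - g_prev
        # Local Lipschitz estimate from the two most recent iterates.
        denom = np.linalg.norm(dx)
        L = np.linalg.norm(dg) / denom if denom > tol else L0
        L = max(L, tol)          # guard against a degenerate (tiny) estimate
        step = 1.0 / L           # stepsize driven by the local constant only
        # Heavy-ball update: gradient step plus a momentum term.
        x_next = x - step * g + beta * dx
        x_prev, g_prev, x = x, g, x_next
        if np.linalg.norm(g) < tol:
            break
    return x

# Usage on a toy quadratic f(x) = 0.5 x^T A x - b^T x, whose minimizer solves A x = b.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
x_star = heavy_ball_local_lipschitz(lambda x: A @ x - b, np.zeros(2))
print(x_star, np.linalg.solve(A, b))  # the two vectors should roughly agree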
