Convergence and Convergence Rate of Stochastic Gradient Search in the Case of Multiple and Non-Isolated Extrema

Published: 2009/07/07, Updated: 2009/07/17

Categories: Control Applications, Stochastic Programming

Keywords: convergence rate, Lojasiewicz gradient inequality, machine learning, point-convergence, stochastic gradient search, system identification

Short URL: https://optimization-online.org/?p=10792

The asymptotic behavior of stochastic gradient algorithms is studied. Relying on results from differential geometry (the Lojasiewicz gradient inequality), almost sure point-convergence is demonstrated and relatively tight almost sure bounds on the convergence rate are derived. In sharp contrast to all existing results of this kind, the asymptotic results obtained here do not require the objective function (associated with the stochastic gradient search) to have an isolated minimum at which the Hessian of the objective function is strictly positive definite. Using the obtained results, the asymptotic behavior of recursive prediction error identification methods is analyzed. The convergence and convergence rate of supervised learning algorithms are also studied relying on these results.
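To make the setting concrete, the following is a minimal sketch of stochastic gradient search on an objective whose minima are not isolated. It is not taken from the paper: the objective f(theta) = (||theta||^2 - 1)^2, the additive Gaussian gradient noise, the step-size schedule, and all function and parameter names are assumptions chosen for illustration. Every point on the unit circle minimizes this objective, and the Hessian is singular there along the tangent direction, so the classical isolated-minimum assumption fails.

```python
import numpy as np


def grad(theta):
    # Gradient of the illustrative objective f(theta) = (||theta||^2 - 1)^2.
    # Its minimizers form the whole unit circle, so no minimum is isolated.
    return 4.0 * (theta @ theta - 1.0) * theta


def stochastic_gradient_search(theta0, n_steps=20000, noise_std=0.5, seed=0):
    # Hypothetical setup: exact gradient plus zero-mean Gaussian noise.
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for n in range(1, n_steps + 1):
        # Assumed schedule: step sizes sum to infinity, their squares do not.
        step = 0.1 / n ** 0.8
        noisy_grad = grad(theta) + noise_std * rng.standard_normal(theta.shape)
        theta = theta - step * noisy_grad
    return theta


theta_final = stochastic_gradient_search([1.5, 0.5])
print("final iterate:", theta_final)
print("distance to the set of minima:", abs(np.linalg.norm(theta_final) - 1.0))
```

In this toy run the iterate should end up close to the unit circle. Which point on the circle it settles at depends on the starting point and the noise realization, which is why point-convergence, rather than convergence to the set of minima, is the delicate question.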
