Gradient Descent Only Converges to Minimizers
We show that gradient descent converges to a local minimizer, almost surely with random initialization. This is proved by applying the Stable Manifold Theorem from dynamical systems theory.
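The claim can be illustrated numerically on a toy problem. The sketch below is not taken from the paper: the test function f(x, y) = x^2/2 + y^4/4 - y^2/2, the step size, the iteration count, and the sampling box are all assumptions chosen for illustration. This f has a strict saddle at (0, 0) and local minimizers at (0, 1) and (0, -1), and the step size 0.05 is below 1/L for the gradient Lipschitz constant L on the region sampled.

```python
import random


def grad(x, y):
    """Gradient of the assumed test function f(x, y) = x^2/2 + y^4/4 - y^2/2."""
    return x, y**3 - y


def gradient_descent(x, y, step=0.05, iters=2000):
    """Run plain gradient descent with a fixed step size (illustrative values)."""
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


def run_random_trials(n_trials=100, seed=0):
    """Start from points drawn uniformly in [-1.5, 1.5]^2 and return the limits."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        x0 = rng.uniform(-1.5, 1.5)
        y0 = rng.uniform(-1.5, 1.5)
        results.append(gradient_descent(x0, y0))
    return results


if __name__ == "__main__":
    # Random starts: each run should end near one of the minimizers (0, +1) or (0, -1).
    for x, y in run_random_trials(n_trials=5):
        print(f"random start -> ({x:.6f}, {y:.6f})")

    # A start on the line y = 0 (the saddle's stable manifold for this f)
    # stays on that line and is pulled into the saddle at the origin.
    print("start on y = 0 ->", gradient_descent(1.0, 0.0))
```

In this example the only starting points that reach the saddle lie on the line y = 0, a set of measure zero in the plane, so a randomly drawn start avoids it with probability one. That is the behavior the Stable Manifold Theorem argument establishes in general.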