Non-asymptotic superlinear convergence of Nesterov accelerated BFGS

In this paper, we derive the explicit finite-time local convergence of the Nesterov accelerated Broyden-Fletcher-Goldfarb-Shanno (NA-BFGS) method under the assumption that the objective function is strongly convex, its gradient is Lipschitz continuous, and its Hessian is Lipschitz continuous at the optimal point. We show that the rate of convergence of the NA-BFGS method is … Read more
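
As a rough point of reference (a generic template, not necessarily the exact scheme analyzed in the paper), a Nesterov accelerated BFGS iteration combines an extrapolation step with the classical BFGS update of the inverse Hessian approximation \(H_k\):

\[
y_k = x_k + \beta_k (x_k - x_{k-1}), \qquad x_{k+1} = y_k - H_k \nabla f(y_k),
\]
\[
H_{k+1} = \Big(I - \frac{s_k u_k^\top}{u_k^\top s_k}\Big) H_k \Big(I - \frac{u_k s_k^\top}{u_k^\top s_k}\Big) + \frac{s_k s_k^\top}{u_k^\top s_k},
\qquad s_k = x_{k+1} - y_k, \quad u_k = \nabla f(x_{k+1}) - \nabla f(y_k).
\]

Here \(\beta_k\) is the momentum parameter and \((s_k, u_k)\) the secant pair; the specific choice of \(\beta_k\) and of the points used to form the secant pair may differ in the paper.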

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron

Modern machine learning focuses on highly expressive models that are able to fit or interpolate the data completely, resulting in zero training loss. For such models, we show that the stochastic gradients of common loss functions satisfy a strong growth condition. Under this condition, we prove that constant step-size stochastic gradient descent (SGD) with Nesterov … Read more
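
For context, the strong growth condition is typically stated as follows for a finite-sum objective \(f(x) = \frac{1}{n}\sum_{i=1}^n f_i(x)\), with the expectation taken over the sampled index \(i\) (the exact constant and variant used in the paper may differ):

\[
\mathbb{E}_i\big[\|\nabla f_i(x)\|^2\big] \;\le\; \rho\,\|\nabla f(x)\|^2 \qquad \text{for all } x .
\]

In particular, at any stationary point of \(f\) the right-hand side is zero, so every stochastic gradient must vanish there, which matches the interpolation setting where a single model fits all data points.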

Gradient methods for convex minimization: better rates under weaker conditions

The convergence behavior of gradient methods for minimizing convex differentiable functions is one of the core questions in convex optimization. This paper shows that their well-known complexities can be achieved under conditions weaker than the commonly assumed ones. We relax the common gradient Lipschitz-continuity condition and strong convexity condition to ones that hold only over … Read more
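
For reference, the two commonly assumed conditions mentioned here are global Lipschitz continuity of the gradient and strong convexity, in their standard forms (the paper relaxes these so that they need hold only over a restricted set of points):

\[
\|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\|, \qquad
f(y) \ge f(x) + \langle \nabla f(x),\, y - x\rangle + \tfrac{\mu}{2}\,\|y - x\|^2 \qquad \text{for all } x, y .
\]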