On Averaging and Extrapolation for Gradient Descent

This work considers the effect of averaging, and more generally extrapolation, of the iterates of gradient descent in smooth convex optimization. After running the method, rather than reporting the final iterate, one can report either a convex combination of the iterates (averaging) or a generic combination of the iterates (extrapolation). For several common stepsize sequences, … Read more

Expected Value of Matrix Quadratic Forms with Wishart distributed Random Matrices

To explore the limits of a stochastic gradient method, it may be useful to consider an example consisting of an infinite number of quadratic functions. In this context, it is appropriate to determine the expected value and the covariance matrix of the stochastic noise, i.e. the difference of the true gradient and the approximated gradient … Read more

Approximate Primal Solutions and Rate Analysis in Dual Subgradient Methods

We study primal solutions obtained as a by-product of subgradient methods when solving the Lagrangian dual of a primal convex constrained optimization problem (possibly nonsmooth). The existing literature on the use of subgradient methods for generating primal optimal solutions is limited to the methods producing such solutions only asymptotically (i.e., in the limit as the … Read more