An acceleration procedure for optimal first-order methods

We introduce in this paper an optimal first-order method that allows an easy and cheap evaluation of the local Lipschitz constant of the objective’s gradient. This constant must ideally be chosen at every iteration as small as possible, while serving in an indispensable upper bound for the value of the objective function. In the previously … Read more

A randomized Mirror-Prox method for solving structured large-scale matrix saddle-point problems

In this paper, we derive a randomized version of the Mirror-Prox method for solving some structured matrix saddle-point problems, such as the maximal eigenvalue minimization problem. Deterministic first-order schemes, such as Nesterov’s Smoothing Techniques or standard Mirror-Prox methods, require the exact computation of a matrix exponential at every iteration, limiting the size of the problems … Read more

Hedge algorithm and Dual Averaging schemes

We show that the Hedge algorithm, a method that is widely used in Machine Learning, can be interpreted as a particular instance of Dual Averaging schemes, which have recently been introduced by Nesterov for regret minimization. Based on this interpretation, we establish three alternative methods of the Hedge algorithm: one in the form of the … Read more

Hedge Algorithm and Subgradient Methods

We show that the Hedge Algorithm, a method that is widely used in Machine Learning, can be interpreted as a particular subgradient algorithm for minimizing a well-chosen convex function, namely as a Mirror Descent Scheme. Using this reformulation, we establish three modificitations and extensions of the Hedge Algorithm that are better or at least as … Read more

Smoothing techniques for solving semidefinite programs with many constraints

We use smoothing techniques to solve approximately mildly structured semidefinite programs with many constraints. As smoothing techniques require a specific problem format, we introduce an alternative problem formulation that fulfills the structural assumptions. The resulting algorithm has a complexity that depends linearly both on the number of constraints and on the inverse of the accuracy. … Read more