Subgame Perfect Methods in Nonsmooth Convex Optimization

This paper considers nonsmooth convex optimization with either a subgradient or proximal operator oracle. In both settings, we identify algorithms that achieve the recently introduced game-theoretic optimality notion for algorithms known as subgame perfection. Subgame perfect algorithms meet a more stringent requirement than just minimax optimality. Not only must they provide optimal uniform guarantees on … Read more

A Practical Adaptive Subgame Perfect Gradient Method

We present a performant gradient method for smooth convex optimization, drawing inspiration from several recent advances in the field. Our algorithm, the Adaptive Subgame Perfect Gradient Method (ASPGM) is based on the notion of subgame perfection, attaining a dynamic strengthening of minimax optimality. At each iteration, ASPGM makes a momentum-type update, optimized dynamically based on … Read more

The Optimal Smoothings of Sublinear Functions and Convex Cones

This paper considers the problem of smoothing convex functions and sets, seeking the nearest smooth convex function or set to a given one. For convex cones and sublinear functions, a full characterization of the set of all optimal smoothings is given. These provide if and only if characterizations of the set of optimal smoothings for … Read more

A Universally Optimal Primal-Dual Method for Minimizing Heterogeneous Compositions

This paper proposes a universal, optimal algorithm for convex minimization problems of the composite form $g_0(x)+h(g_1(x),\dots, g_m(x)) + u(x)$. We allow each $g_j$ to independently range from being nonsmooth Lipschitz to smooth, from convex to strongly convex, described by notions of H\”older continuous gradients and uniform convexity. Note that, although the objective is built from … Read more

Some Unified Theory for Variance Reduced Prox-Linear Methods

This work considers the nonconvex, nonsmooth problem of minimizing a composite objective of the form $f(g(x))+h(x)$ where the inner mapping $g$ is a smooth finite summation or expectation amenable to variance reduction. In such settings, prox-linear methods can enjoy variance-reduced speed-ups despite the existence of nonsmoothness. We provide a unified convergence theory applicable to a … Read more

Performance Estimation for Smooth and Strongly Convex Sets

We extend recent computer-assisted design and analysis techniques for first-order optimization over structured functions–known as performance estimation–to apply to structured sets. We prove “interpolation theorems” for smooth and strongly convex sets with Slater points and bounded diameter, showing a wide range of extremal questions amount to structured mathematical programs. Prior function interpolation theorems are recovered … Read more

Scalable Projection-Free Optimization Methods via MultiRadial Duality Theory

Recent works have developed new projection-free first-order methods based on utilizing linesearches and normal vector computations to maintain feasibility. These oracles can be cheaper than orthogonal projection or linear optimization subroutines but have the drawback of requiring a known strictly feasible point to do these linesearches with respect to. In this work, we develop new … Read more

On Averaging and Extrapolation for Gradient Descent

This work considers the effect of averaging, and more generally extrapolation, of the iterates of gradient descent in smooth convex optimization. After running the method, rather than reporting the final iterate, one can report either a convex combination of the iterates (averaging) or a generic combination of the iterates (extrapolation). For several common stepsize sequences, … Read more

Some Primal-Dual Theory for Subgradient Methods for Strongly Convex Optimization

We consider (stochastic) subgradient methods for strongly convex but potentially nonsmooth non-Lipschitz optimization. We provide new equivalent dual descriptions (in the style of dual averaging) for the classic subgradient method, the proximal subgradient method, and the switching subgradient method. These equivalences enable $O(1/T)$ convergence guarantees in terms of both their classic primal gap and a … Read more

Goldstein Stationarity in Lipschitz Constrained Optimization

We prove the first convergence guarantees for a subgradient method minimizing a generic Lipschitz function over generic Lipschitz inequality constraints. No smoothness or convexity (or weak convexity) assumptions are made. Instead, we utilize a sequence of recent advances in Lipschitz unconstrained minimization, which showed convergence rates of $O(1/\delta\epsilon^3)$ towards reaching a “Goldstein” stationary point, that … Read more