On Optimal Universal First-Order Methods for Minimizing Heterogeneous Sums

This work considers minimizing a convex sum of functions, each with potentially different structure ranging from nonsmooth to smooth, Lipschitz to non-Lipschitz. Nesterov’s universal fast gradient method provides an optimal black-box first-order method for minimizing a single function that takes advantage of any continuity structure present without requiring prior knowledge. In this paper, we show … Read more

Optimal Convergence Rates for the Proximal Bundle Method

We study convergence rates of the classic proximal bundle method for a variety of nonsmooth convex optimization problems. We show that, without any modification, this algorithm adapts to converge faster in the presence of smoothness or a Hölder growth condition. Our analysis reveals that with a constant stepsize, the bundle method is adaptive, yet it … Read more

Radial Duality Part I: Foundations

(Renegar, 2016) introduced a novel approach to transforming generic conic optimization problems into unconstrained, uniformly Lipschitz continuous minimization. We introduce radial transformations generalizing these ideas, equipped with an entirely new motivation and development that avoids any reliance on convex cones or functions. Perhaps of greatest practical importance, this facilitates the development of new families of … Read more

Radial Duality Part II: Applications and Algorithms

The first part of this work established the foundations of a radial duality between nonnegative optimization problems, inspired by the work of (Renegar, 2016). Here we utilize our radial duality theory to design and analyze projection-free optimization algorithms that operate by solving a radially dual problem. In particular, we consider radial subgradient, smoothing, and accelerated … Read more

The Landscape of the Proximal Point Method for Nonconvex-Nonconcave Minimax Optimization

Minimax optimization has become a central tool for modern machine learning with applications in generative adversarial networks, robust optimization, reinforcement learning, etc. These applications are often nonconvex-nonconcave, but the existing theory is unable to identify and deal with the fundamental difficulties posed by nonconvex-nonconcave structures. In this paper, we study the classic proximal point method … Read more

General Convergence Rates Follow From Specialized Rates Assuming Growth Bounds

Often in the analysis of first-order methods, assuming the existence of a quadratic growth bound (a generalization of strong convexity) facilitates much stronger convergence analysis. Hence the analysis is done twice, once for the general case and once for the growth bounded case. We give a meta-theorem for deriving general convergence rates from those assuming … Read more

A Simple Nearly-Optimal Restart Scheme For Speeding-Up First-Order Methods

We present a simple scheme for restarting first-order methods for convex optimization problems. Restarts are made based only on achieving specified decreases in objective values, the specified amounts being the same for all optimization problems. Unlike existing restart schemes, the scheme makes no attempt to learn parameter values characterizing the structure of an optimization problem, … Read more

Convergence Rates for Deterministic and Stochastic Subgradient Methods Without Lipschitz Continuity

We generalize the classic convergence rate theory for subgradient methods to apply to non-Lipschitz functions via a new measure of steepness. For the deterministic projected subgradient method, we derive a global $O(1/\sqrt{T})$ convergence rate for any function with at most exponential growth. Our approach implies generalizations of the standard convergence rates for gradient descent on … Read more

Radial Subgradient Descent

We present a subgradient method for minimizing non-smooth, non-Lipschitz convex optimization problems. The only structure assumed is that a strictly feasible point is known. We extend the work of Renegar [1] by taking a different perspective, leading to an algorithm which is conceptually more natural, has notably improved convergence rates, and for which the analysis … Read more