Radial Duality Part II: Applications and Algorithms

The first part of this work established the foundations of a radial duality between nonnegative optimization problems, inspired by the work of Renegar (2016). Here we use our radial duality theory to design and analyze projection-free optimization algorithms that operate by solving a radially dual problem. In particular, we consider radial subgradient, smoothing, and accelerated …
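
For reference, the radial transformation at the heart of this duality (in the notation of Part I) maps a nonnegative function $f$ to $f^\Gamma(y) = \sup\{v > 0 : v\,f(y/v) \leq 1\}$; the radially dual problem maximizes $f^\Gamma$, and under the conditions established in Part I the transformation is an involution, $(f^\Gamma)^\Gamma = f$, so guarantees for the dual problem translate back to the primal.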

The Landscape of the Proximal Point Method for Nonconvex-Nonconcave Minimax Optimization

Minimax optimization has become a central tool for modern machine learning with applications in generative adversarial networks, robust optimization, reinforcement learning, etc. These applications are often nonconvex-nonconcave, but the existing theory is unable to identify and deal with the fundamental difficulties posed by nonconvex-nonconcave structures. In this paper, we study the classic proximal point method …
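
For concreteness, each proximal point step for $\min_x \max_y L(x,y)$ can be written as the resolvent $z_{k+1} = (I + \lambda F)^{-1}(z_k)$ of the saddle operator $F(x,y) = (\nabla_x L(x,y), -\nabla_y L(x,y))$. The sketch below (toy objective, parameter $\lambda$, and inner solver all chosen for illustration, not taken from the paper) implements this iteration; whether such trajectories converge, cycle, or diverge on nonconvex-nonconcave problems is exactly the landscape question at issue.

import numpy as np
from scipy.optimize import fsolve

def F(z):
    # Saddle operator of the toy objective L(x, y) = x*y + cos(x) - cos(y),
    # which is nonconvex in x and nonconcave in y:
    # F(x, y) = (dL/dx, -dL/dy) = (y - sin x, -x - sin y).
    x, y = z
    return np.array([y - np.sin(x), -x - np.sin(y)])

def proximal_point(z0, lam=0.5, iters=50):
    # One PPM step solves the resolvent equation w + lam*F(w) = z_k for w,
    # here via a generic nonlinear solver warm-started at the current point.
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        z = fsolve(lambda w, zk=z: w + lam * F(w) - zk, z)
    return z

z = proximal_point([2.0, -1.5])
print(z, np.linalg.norm(F(z)))  # a small residual means the trajectory settled at a stationary point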

General Convergence Rates Follow From Specialized Rates Assuming Growth Bounds

Often in the analysis of first-order methods, assuming the existence of a quadratic growth bound (a generalization of strong convexity) facilitates much stronger convergence analysis. Hence the analysis is done twice, once for the general case and once for the growth-bounded case. We give a meta-theorem for deriving general convergence rates from those assuming …
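
For concreteness, one standard form of such a bound asks that $f(x) - \min f \geq \frac{\alpha}{2}\,\mathrm{dist}(x, X^*)^2$ for all $x$, where $X^*$ is the set of minimizers and $\alpha > 0$. Strong convexity implies this, but not conversely: for example, $f(x) = \max\{|x| - 1, 0\}^2$ satisfies it with $\alpha = 2$ despite being flat on its minimizer set $[-1, 1]$ and hence not strongly convex.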

A Simple Nearly-Optimal Restart Scheme For Speeding-Up First-Order Methods

We present a simple scheme for restarting first-order methods for convex optimization problems. Restarts are made based only on achieving specified decreases in objective values, the specified amounts being the same for all optimization problems. Unlike existing restart schemes, ours makes no attempt to learn parameter values characterizing the structure of an optimization problem, …
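
A simplified single-loop sketch of the idea (the paper's scheme runs several such loops with different decrease amounts; the underlying method, step sizes, and test problem below are illustrative stand-ins):

import numpy as np

def subgrad_step(grad, x, k, R=10.0):
    # Classic normalized subgradient step with stepsize R/sqrt(k+1); the
    # counter k resets at each restart, resetting the stepsize schedule.
    g = grad(x)
    return x - (R / np.sqrt(k + 1)) * g / max(np.linalg.norm(g), 1e-12)

def restarted_subgradient(f, grad, x0, decrease=1.0, total_iters=2000):
    # Restart whenever the objective has dropped by `decrease` since the
    # last restart; no problem-dependent parameters are estimated.
    x = np.asarray(x0, dtype=float)
    anchor, k = f(x), 0
    for _ in range(total_iters):
        x = subgrad_step(grad, x, k)
        k += 1
        if anchor - f(x) >= decrease:
            anchor, k = f(x), 0   # specified decrease achieved: restart
    return x

f = lambda x: np.linalg.norm(x, 1)
grad = lambda x: np.sign(x)
print(restarted_subgradient(f, grad, x0=50 * np.ones(5)))  # approaches the minimizer 0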

Convergence Rates for Deterministic and Stochastic Subgradient Methods Without Lipschitz Continuity

We generalize the classic convergence rate theory for subgradient methods to apply to non-Lipschitz functions via a new measure of steepness. For the deterministic projected subgradient method, we derive a global $O(1/\sqrt{T})$ convergence rate for any function with at most exponential growth. Our approach implies generalizations of the standard convergence rates for gradient descent on …
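
The iteration analyzed is the classic projected subgradient method $x_{k+1} = \mathrm{proj}_C(x_k - \alpha_k g_k)$; the novelty lies in the analysis. A minimal sketch, with a generic $1/\sqrt{k+1}$ schedule standing in for the paper's steepness-based step sizes, on a toy objective $f(x) = \|x\|^2$ whose gradient is unbounded (hence not globally Lipschitz):

import numpy as np

def proj_ball(x, radius=5.0):
    # Euclidean projection onto the ball of the given radius.
    n = np.linalg.norm(x)
    return x if n <= radius else (radius / n) * x

def projected_subgradient(grad, x0, iters=1000):
    # x_{k+1} = proj_C(x_k - alpha_k * g_k) with alpha_k = 1/sqrt(k+1);
    # this generic schedule is for illustration only.
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        x = proj_ball(x - grad(x) / np.sqrt(k + 1))
    return x

print(projected_subgradient(lambda x: 2 * x, x0=np.array([4.0, -3.0])))  # approaches 0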

Radial Subgradient Descent

We present a subgradient method for solving non-smooth, non-Lipschitz convex optimization problems. The only structure assumed is that a strictly feasible point is known. We extend the work of Renegar [1] by taking a different perspective, leading to an algorithm which is conceptually more natural, has notably improved convergence rates, and for which the analysis …
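
The transformation the method operates on is the one recalled above, $f^\Gamma(y) = \sup\{v > 0 : v\,f(y/v) \leq 1\}$, and it is straightforward to evaluate numerically. A toy sketch (the test function and bracketing values below are hand-picked for illustration):

import numpy as np

def radial(f, y, v_feasible, v_upper, tol=1e-12):
    # Bisection for f_Gamma(y) = sup{ v > 0 : v*f(y/v) <= 1 }.  For convex f,
    # v*f(y/v) is convex in v (it is f's perspective function), so the
    # feasible v form an interval; bisecting between a known-feasible v and
    # an infeasible upper bound locates the interval's right endpoint.
    lo, hi = v_feasible, v_upper
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if mid * f(y / mid) <= 1 else (lo, mid)
    return lo

f = lambda x: x**2 + 1  # convex toy with a closed form:
y = 0.3                 # f_Gamma(y) = (1 + sqrt(1 - 4*y**2))/2 for |y| <= 1/2
print(radial(f, y, v_feasible=0.5, v_upper=1.0))  # ~0.9
print((1 + np.sqrt(1 - 4 * y**2)) / 2)            # ~0.9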