A Parameter-Free Restart Scheme with Only a Parallelizable $\log\log(1/\epsilon)$ Overhead

It is well-known that first-order methods can offer accelerated convergence rates in the presence of growth structures. Restarting schemes provide a general tool for such speed-ups. These schemes typically either require unrealistic problem knowledge, incur logarithmic overhead factors in oracle complexity, and/or have a nontrivial initial burn-in phase. We present a parameter-free approach for restarting … Read more

Accelerated proximal gradient algorithm for weakly convex function

In this work, we investigate the accelerated proximal gradient algorithm (APGα) for weakly convex composite optimization problems. Building upon the framework of B¨ohm and Wright, and additionally assuming that f is convex and coercive while g is bounded below, we establish an objective residual convergence rate of O(1⁄j²) for α≥3. Moreover, when α›3, this rate … Read more

Non-smooth stochastic gradient descent using smoothing functions

In this paper, we address stochastic optimization problems involving a composition of a non-smooth outer function and a smooth inner function, a formulation frequently encountered in machine learning and operations research. To deal with the non-differentiability of the outer function, we approximate the original non-smooth function using smoothing functions, which are continuously differentiable and approach … Read more

Scalable Projection-Free Optimization Methods via MultiRadial Duality Theory

Recent works have developed new projection-free first-order methods based on utilizing linesearches and normal vector computations to maintain feasibility. These oracles can be cheaper than orthogonal projection or linear optimization subroutines but have the drawback of requiring a known strictly feasible point to do these linesearches with respect to. In this work, we develop new … Read more

Fast convergence of inertial primal-dual dynamics and algorithms for a bilinearly coupled saddle point problem

This paper is devoted to study the convergence rates of a second-order dynamical system and its corresponding discretization associated with a continuously differentiable bilinearly coupled convex-concave saddle point problem. First, we consider the second-order dynamical system with asymptotically vanishing damping term and show the existence and uniqueness of the trajectories as global twice continuously differentiable … Read more

Provably Faster Gradient Descent via Long Steps

This work establishes provably faster convergence rates for gradient descent in smooth convex optimization via a computer-assisted analysis technique. Our theory allows nonconstant stepsize policies with frequent long steps potentially violating descent by analyzing the overall effect of many iterations at once rather than the typical one-iteration inductions used in most first-order method analyses. We … Read more

Affine invariant convergence rates of the conditional gradient method

We show that the conditional gradient method for the convex composite problem \[\min_x\{f(x) + \Psi(x)\}\] generates primal and dual iterates with a duality gap converging to zero provided a suitable growth property holds and the algorithm makes a judicious choice of stepsizes. The rate of convergence of the duality gap to zero ranges from sublinear … Read more

A Stochastic Bregman Primal-Dual Splitting Algorithm for Composite Optimization

We study a stochastic first order primal-dual method for solving convex-concave saddle point problems over real reflexive Banach spaces using Bregman divergences and relative smoothness assumptions, in which we allow for stochastic error in the computation of gradient terms within the algorithm. We show ergodic convergence in expectation of the Lagrangian optimality gap with a … Read more