First-Order Objective-Function-Free Optimization Algorithms and Their Complexity

A class of algorithms for unconstrained nonconvex optimization is considered where the value of the objective function is never computed. The class contains a deterministic version of the first-order Adagrad method typically used for minimization of noisy function, but also allows the use of second-order information when available. The rate of convergence of methods in … Read more

Global Complexity Bound of a Proximal ADMM for Linearly-Constrained Nonseperable Nonconvex Composite Programming

This paper proposes and analyzes a dampened proximal alternating direction method of multipliers (DP.ADMM) for solving linearly-constrained nonconvex optimization problems where the smooth part of the objective function is nonseparable. Each iteration of DP.ADMM consists of: (ii) a sequence of partial proximal augmented Lagrangian (AL) updates, (ii) an under-relaxed Lagrange multiplier update, and (iii) a … Read more

Frank-Wolfe and friends: a journey into projection-free first-order optimization methods

Invented some 65 years ago in a seminal paper by Marguerite Straus-Frank and Philip Wolfe, the Frank-Wolfe method recently enjoys a remarkable revival, fuelled by the need of fast and reliable first-order optimization methods in Data Science and other relevant application areas. This review tries to explain the success of this approach by illustrating versatility … Read more

Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient

We present PDLP, a practical first-order method for linear programming (LP) that can solve to the high levels of accuracy that are expected in traditional LP applications. In addition, it can scale to very large problems because its core operation is matrix-vector multiplications. PDLP is derived by applying the primal-dual hybrid gradient (PDHG) method, popularized … Read more

Average Curvature FISTA for Nonconvex Smooth Composite Optimization Problems

A previous authors’ paper introduces an accelerated composite gradient (ACG) variant, namely AC-ACG, for solving nonconvex smooth composite optimization (N-SCO) problems. In contrast to other ACG variants, AC-ACG estimates the local upper curvature of the N-SCO problem by using the average of the observed upper-Lipschitz curvatures obtained during the previous iterations, and uses this estimation … Read more

FrankWolfe.jl: a high-performance and flexible toolbox for Frank-Wolfe algorithms and Conditional Gradients

We present FrankWolfe.jl, an open-source implementation of several popular Frank-Wolfe and Conditional Gradients variants for first-order constrained optimization. The package is designed with flexibility and high-performance in mind, allowing for easy extension and relying on few assumptions regarding the user-provided functions. It supports Julia’s unique multiple dispatch feature, and interfaces smoothly with generic linear optimization … Read more

Universal Conditional Gradient Sliding for Convex Optimization

In this paper, we present a first-order projection-free method, namely, the universal conditional gradient sliding (UCGS) method, for solving ε-approximate solutions to convex differentiable optimization problems. For objective functions with Hölder continuous gradients, we show that UCGS is able to terminate with ε-solutions with at most O((1/ε)^(2/(1+3v))) gradient evaluations and O((1/ε)^(4/(1+3v))) linear objective optimizations, where … Read more

Infeasibility detection with primal-dual hybrid gradient for large-scale linear programming

We study the problem of detecting infeasibility of large-scale linear programming problems using the primal-dual hybrid gradient method (PDHG) of Chambolle and Pock (2011). The literature on PDHG has mostly focused on settings where the problem at hand is assumed to be feasible. When the problem is not feasible, the iterates of the algorithm do … Read more

A unifying framework for the analysis of projection-free first-order methods under a sufficient slope condition

The analysis of projection-free first order methods is often complicated by the presence of different kinds of “good” and “bad” steps. In this article, we propose a unifying framework for projection-free methods, aiming to simplify the converge analysis by getting rid of such a distinction between steps. The main tool employed in our framework is … Read more

Sparse Regression at Scale: Branch-and-Bound rooted in First-Order Optimization

We consider the least squares regression problem, penalized with a combination of the L0 and L2 norms (a.k.a. L0 L2 regularization). Recent work presents strong evidence that the resulting L0-based estimators can outperform popular sparse learning methods, under many important high-dimensional settings. However, exact computation of L0-based estimators remains a major challenge. Indeed, state-of-the-art mixed … Read more