Preconditioned subgradient method for composite optimization: overparameterization and fast convergence

Composite optimization problems involve minimizing the composition of a smooth map with a convex function. Such objectives arise in numerous data science and signal processing applications, including phase retrieval, blind deconvolution, and collaborative filtering. The subgradient method achieves local linear convergence when the composite loss is well-conditioned. However, if the smooth map is, in a … Read more
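
As a rough illustration of the composite setting described in this abstract (not the preconditioned method the paper proposes), here is a minimal sketch of a plain subgradient method on a loss of the form $h(c(x))$, using a robust phase-retrieval objective $f(x) = \frac{1}{m}\sum_i |\langle a_i, x\rangle^2 - b_i|$ with Polyak step sizes; the objective, step rule, and noiseless setup are assumptions made for the example.

```python
import numpy as np

def composite_subgradient(A, b, x0, iters=500):
    """Plain (unpreconditioned) subgradient method with Polyak steps on the
    robust phase-retrieval loss f(x) = mean(|<a_i, x>^2 - b_i|).
    Illustration of the composite structure h(c(x)) with h = |.| (convex)
    and c_i(x) = <a_i, x>^2 - b_i (smooth); not the paper's method."""
    x = x0.copy()
    m = A.shape[0]
    for _ in range(iters):
        r = (A @ x) ** 2 - b                     # residuals c(x)
        f = np.abs(r).mean()                     # composite value h(c(x))
        # chain rule: a subgradient is (1/m) * sum_i sign(r_i) * 2 <a_i, x> a_i
        g = (A.T @ (np.sign(r) * 2.0 * (A @ x))) / m
        gn = np.linalg.norm(g)
        if gn < 1e-12:
            break
        x -= (f / gn**2) * g                     # Polyak step (assumes optimal value 0)
    return x

# toy usage with noiseless measurements, so the optimal value is 0
rng = np.random.default_rng(0)
m, n = 200, 20
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = (A @ x_true) ** 2
x_hat = composite_subgradient(A, b, rng.standard_normal(n))
```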

Lyapunov-based Analysis on First Order Method for Composite Strong-Weak Convex Functions

Nesterov’s accelerated gradient (NAG) method generalizes the classical gradient descent algorithm, improving the convergence rate from $\mathcal{O}\left(\frac{1}{t}\right)$ to $\mathcal{O}\left(\frac{1}{t^2}\right)$ in convex optimization. This study examines the proximal gradient framework for additively separable composite functions with smooth and non-smooth components. We demonstrate that Nesterov’s accelerated proximal gradient (NAPG$_\alpha$) method attains a convergence rate of … Read more
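
For reference, the following is a minimal sketch of the standard accelerated proximal gradient (FISTA-type) scheme for an additively separable objective $f(x) + g(x)$; this is the classical method the abstract builds on, not the NAPG$_\alpha$ variant or its Lyapunov analysis, and the lasso instance at the bottom is an assumption made for the example.

```python
import numpy as np

def accelerated_proximal_gradient(grad_f, L, prox_g, x0, iters=200):
    """Standard Nesterov/FISTA-type accelerated proximal gradient for
    min_x f(x) + g(x), with f L-smooth and g handled through its prox.
    This is the classical scheme, not the NAPG_alpha variant from the paper."""
    x_prev = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(iters):
        x = prox_g(y - grad_f(y) / L, 1.0 / L)        # proximal gradient step at y
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)   # momentum extrapolation
        x_prev, t = x, t_next
    return x_prev

# example: lasso, f(x) = 0.5*||Ax - b||^2 (smooth), g(x) = lam*||x||_1 (non-smooth)
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 50))
b = rng.standard_normal(100)
lam = 0.1
L = np.linalg.norm(A, 2) ** 2                          # Lipschitz constant of grad f
soft = lambda z, tau: np.sign(z) * np.maximum(np.abs(z) - lam * tau, 0.0)
x_star = accelerated_proximal_gradient(lambda x: A.T @ (A @ x - b), L, soft, np.zeros(50))
```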

Inexact subgradient algorithm with a non-asymptotic convergence guarantee for copositive programming problems

In this paper, we propose a subgradient algorithm with a non-asymptotic convergence guarantee to solve copositive programming problems. The subproblem to be solved at each iteration is a standard quadratic programming problem, which is NP-hard in general. However, the proposed algorithm allows this subproblem to be solved inexactly. For a prescribed accuracy $\epsilon > 0$ … Read more
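
The subproblem mentioned here, a standard quadratic program $\min_{x \in \Delta_n} x^\top Q x$ over the unit simplex, is the piece the algorithm is allowed to solve inexactly. Below is a hedged sketch of one possible inexact subproblem solver (projected gradient with a sort-based simplex projection); the paper does not prescribe this solver, and it only returns a stationary point rather than a certified $\epsilon$-accurate solution.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the unit simplex {x >= 0, sum(x) = 1}
    via the standard sort-based algorithm."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def standard_qp_inexact(Q, iters=500, step=None):
    """Approximately solve the standard QP  min_{x in simplex} x^T Q x
    by projected gradient descent.  The problem is NP-hard in general,
    so this only returns a stationary point -- an example of the kind of
    inexact subproblem solution the abstract alludes to."""
    n = Q.shape[0]
    x = np.full(n, 1.0 / n)
    if step is None:
        step = 1.0 / (2.0 * np.linalg.norm(Q, 2) + 1e-12)
    for _ in range(iters):
        x = project_simplex(x - step * (Q + Q.T) @ x)
    return x, x @ Q @ x
```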

A Practical Adaptive Subgame Perfect Gradient Method

We present a performant gradient method for smooth convex optimization, drawing inspiration from several recent advances in the field. Our algorithm, the Adaptive Subgame Perfect Gradient Method (ASPGM), is based on the notion of subgame perfection, attaining a dynamic strengthening of minimax optimality. At each iteration, ASPGM makes a momentum-type update, optimized dynamically based on … Read more

Convexification of a Separable Function over a Polyhedral Ground Set

In this paper, we study the set \(\mathcal{S}^\kappa = \{ (x,y)\in\mathcal{G}\times\mathbb{R}^n : y_j = x_j^\kappa , j=1,\dots,n\}\), where \(\kappa > 1\) and the ground set \(\mathcal{G}\) is a nonempty polytope contained in \( [0,1]^n\). This nonconvex set is closely related to separable standard quadratic programming and appears as a substructure in potential-based network flow problems … Read more
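
For intuition about the set \(\mathcal{S}^\kappa\), here is a small worked special case (an illustration, not taken from the paper): with \(n = 1\), \(\mathcal{G} = [0,1]\), and \(\kappa = 2\), the set and its convex hull can be written explicitly.

```latex
% Illustrative special case (n = 1, G = [0,1], kappa = 2), not from the paper.
% The convex hull is bounded below by the curve y = x^2 (its convex envelope)
% and above by the secant line through (0,0) and (1,1):
\[
  \mathcal{S}^2 = \{(x,y)\in[0,1]\times\mathbb{R} : y = x^2\},
  \qquad
  \operatorname{conv}(\mathcal{S}^2) = \{(x,y) : x^2 \le y \le x,\; 0 \le x \le 1\}.
\]
```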

Optimization in Theory and Practice

Algorithms for continuous optimization problems have a rich history of design and innovation over the past several decades, in which mathematical analysis of their convergence and complexity properties plays a central role. Beyond their theoretical properties, optimization algorithms are also of interest for their practical usefulness as computational tools for solving real-world problems. There are often … Read more

A Simple Adaptive Proximal Gradient Method for Nonconvex Optimization

Consider composite nonconvex optimization problems where the objective function consists of a smooth nonconvex term (with Lipschitz-continuous gradient) and a convex (possibly nonsmooth) term. Existing parameter-free methods for such problems often rely on complex multi-loop structures, require line searches, or depend on restrictive assumptions (e.g., bounded iterates). To address these limitations, we introduce a novel … Read more
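
Since the abstract does not spell out the proposed update rule, the following is only a generic sketch of proximal gradient with a locally estimated stepsize, a common adaptive heuristic for this problem class; the estimate rule, growth safeguard, and initial stepsize are assumptions made for the example and are not the paper's method.

```python
import numpy as np

def adaptive_prox_grad(grad_f, prox_g, x0, iters=300, alpha0=1e-3):
    """Proximal gradient for min f(x) + g(x), f smooth (possibly nonconvex),
    g convex with an available prox.  The stepsize is set from a crude local
    estimate of the Lipschitz constant of grad f using successive iterates.
    Generic adaptive heuristic, NOT the parameter-free rule proposed in the paper."""
    x_old = x0.copy()
    g_old = grad_f(x_old)
    alpha = alpha0
    x = prox_g(x_old - alpha * g_old, alpha)
    for _ in range(iters):
        g = grad_f(x)
        dx, dg = x - x_old, g - g_old
        # local inverse-curvature estimate ||dx|| / ||dg||, kept from growing too fast
        if np.linalg.norm(dg) > 1e-12:
            alpha = min(np.sqrt(2.0) * alpha,
                        0.5 * np.linalg.norm(dx) / np.linalg.norm(dg))
        x_old, g_old = x, g
        x = prox_g(x - alpha * g, alpha)
    return x
```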

Moment-SOS and spectral hierarchies for polynomial optimization on the sphere and quantum de Finetti theorems

We revisit the convergence analysis of two approximation hierarchies for polynomial optimization on the unit sphere. The first one is based on the moment-SOS approach and gives semidefinite bounds for which Fang and Fawzi (2021) established an \(O(1/r^2)\) convergence analysis for the r-th level bound, using the polynomial kernel method. The second hierarchy was recently … Read more
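
For context, the standard r-th level sum-of-squares lower bound on the unit sphere that such analyses target can be written as follows (generic formulation only; the paper's two hierarchies and their constants are not reproduced here).

```latex
% Standard r-th level SOS lower bound for minimizing a polynomial f on the unit
% sphere S^{n-1} = { x : ||x||^2 = 1 }.  Generic formulation for context only.
\[
  f^{(r)}_{\mathrm{sos}}
  \;=\;
  \max\Bigl\{\, \lambda \;:\; f - \lambda = \sigma + q\,(\|x\|^2 - 1),\;
  \sigma \text{ SOS},\ \deg\sigma \le 2r,\ q \text{ a polynomial} \Bigr\}
  \;\le\; \min_{x \in S^{n-1}} f(x).
\]
```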

A Riemannian AdaGrad-Norm Method

We propose a manifold AdaGrad-Norm method (\textsc{MAdaGrad}), which extends the norm version of AdaGrad (AdaGrad-Norm) to Riemannian optimization. In contrast to line-search schemes, which may require several exponential map computations per iteration, \textsc{MAdaGrad} requires only one. Assuming the objective function $f$ has Lipschitz continuous Riemannian gradient, we show that the method requires at most $\mathcal{O}(\varepsilon^{-2})$ … Read more
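
As a hedged sketch of the AdaGrad-Norm idea carried to a manifold, the following runs an AdaGrad-Norm-style step on the unit sphere with a single exponential map per iteration; the objective (a Rayleigh quotient), the constants, and the stopping tolerance are assumptions made for the example, and the exact \textsc{MAdaGrad} update in the paper may differ in its details.

```python
import numpy as np

def madagrad_sphere_sketch(C, x0, eta=1.0, b0=1e-6, iters=500):
    """AdaGrad-Norm-style iteration on the unit sphere S^{n-1} for
    f(x) = x^T C x (Rayleigh quotient with ||x|| = 1).  One exponential map
    per iteration; the accumulated squared Riemannian gradient norms set
    the stepsize.  Illustrative only -- not necessarily the paper's rule."""
    x = x0 / np.linalg.norm(x0)
    b2 = b0 ** 2
    for _ in range(iters):
        egrad = 2.0 * C @ x                        # Euclidean gradient
        rgrad = egrad - (x @ egrad) * x            # project onto tangent space at x
        b2 += rgrad @ rgrad                        # accumulate squared Riemannian norms
        v = -(eta / np.sqrt(b2)) * rgrad           # AdaGrad-Norm step in the tangent space
        nv = np.linalg.norm(v)
        if nv < 1e-14:
            break
        # exponential map on the sphere: Exp_x(v) = cos(||v||) x + sin(||v||) v/||v||
        x = np.cos(nv) * x + np.sin(nv) * (v / nv)
    return x

# toy usage: approximate the eigenvector of the smallest eigenvalue of C
rng = np.random.default_rng(2)
M = rng.standard_normal((30, 30))
C = (M + M.T) / 2
x_min = madagrad_sphere_sketch(C, rng.standard_normal(30))
```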