A Parameter-Free Restart Scheme with Only a Parallelizable $\log\log(1/\epsilon)$ Overhead

It is well-known that first-order methods can offer accelerated convergence rates in the presence of growth structures. Restarting schemes provide a general tool for such speed-ups. These schemes typically either require unrealistic problem knowledge, incur logarithmic overhead factors in oracle complexity, and/or have a nontrivial initial burn-in phase. We present a parameter-free approach for restarting … Read more

Disjunctive Sum of Squares

We introduce the concept of disjunctive sum of squares for certifying nonnegativity of polynomials. Unlike the popular sum of squares approach where nonnegativity is certified by a single algebraic identity, the disjunctive sum of squares approach certifies nonnegativity with multiple algebraic identities which can be found in parallel. Our main result is a disjunctive Positivstellensatz … Read more

Log-Averaged Mirror Prox for Fast, Large-Scale Optimal Transport in Linear Space

We propose Log-Averaged Mirror Prox (LAMP), a linear-space primal-dual method for large-scale optimal transport. LAMP implements primal mirror prox updates by tracking an averaged dual sequence, reducing storage complexity from \({O}(nm)\) to \({O}(n+m)\) while preserving dense, GPU-friendly reductions. Consequently, LAMP preserves the last-iterate \(\widetilde{{O}}( nm\varepsilon^{-1})\) arithmetic complexity of conservatively parameterized primal-dual mirror prox. We further … Read more

Inertial forward-backward methods with subgradient-based corrections

Shi et al. \cite{shi2022understanding} propose acceleration methods to solve smooth convex optimization problems. In our work, we focus on the general unconstrained composite non-smooth convex optimization problem. We provide an inertial forward-backward algorithm with subgradient correction, derived through time discretization of the ODE, as studied by Shi et al. We achieve the rate of convergence … Read more

Stochastic Gradient Methods with Online Scaling

This paper introduces Stochastic Online Scaled Gradient Methods (SOSGM), a generalization of the recently developed adaptive preconditioning framework in \cite{gao2025gradient,chu2025gradient} to stochastic optimization. Under standard assumptions, we establish convergence guarantees for SOSGM using large batchsize or variance reduction. SOSGM is compatible with popular diagonal and/or low-rank preconditioners as well as heavy-ball momentum, while maintaining memory … Read more

Function-free Optimization via Comparison Oracles

In this work, we study optimization specified only through a comparison oracle: given two points, it reports which one is preferred. We call it function-free optimization because we do not assume access to, nor the existence of, a canonical application-given objective function. Instead, our goal is to find a most-preferred feasible point, which we call … Read more

Supervised feature selection via multiobjective programming and its application in the medical field

In this study, we model the supervised feature selection problem using a novel approach: convex bi-objective optimization. Traditional methods have addressed this problem by maximizing relevance to class labels and minimizing redundancy among features. Recently, Wang et al. [30] formulated this problem as a single-objective convex optimization, yielding only a unique solution. Unlike that, we … Read more

Second-Order Optimality Conditions for Bilevel Optimization Problems Using Parabolic Directional Derivatives

This paper studies the second-order properties of a class of inequality-constrained bilevel programming problems. First-order optimality conditions for the existence of solutions to bilevel optimization problems are derived using the first-order directional derivative of the optimal solution function of the lower-level problem in the seminal paper by Dempe (1992). In this work, we prove that … Read more

Convergence of the Frank-Wolfe Algorithm for Monotone Variational Inequalities

We consider the Frank-Wolfe algorithm for solving variational inequalities over compact, convex sets under a monotone \(C^1\) operator and vanishing, nonsummable step sizes. We introduce a continuous-time interpolation of the discrete iteration and use tools from dynamical systems theory to analyze its asymptotic behavior. This allows us to derive convergence results for the original discrete … Read more

Negative Momentum for Convex-Concave Optimization

This paper revisits momentum in the context of min-max optimization. Momentum is a celebrated mechanism for accelerating gradient dynamics in settings like convex minimization, but its direct use in min-max optimization makes gradient dynamics diverge. Surprisingly, Gidel et al. 2019 showed that negative momentum can help fix convergence. However, despite these promising initial results and … Read more