Provable and Practical Online Learning Rate Adaptation with Hypergradient Descent

This paper investigates the convergence properties of the hypergradient descent method (HDM), a 25-year-old heuristic originally proposed for adaptive stepsize selection in stochastic first-order methods. We provide the first rigorous convergence analysis of HDM using the online learning framework of [Gao24] and apply this analysis to develop new state-of-the-art adaptive gradient methods with empirical and …

Mean and variance estimation complexity in arbitrary distributions via Wasserstein minimization

Parameter estimation is a fundamental challenge in machine learning, crucial for tasks such as neural network weight fitting and Bayesian inference. This paper focuses on the complexity of estimating translation μ∈R^l and shrinkage σ∈R++ parameters for a distribution of the form (1/σ^l) f_0((x−μ)/σ), where f_0 is a known density in R^l, given n samples. We …

An Augmented Lagrangian Approach to Bi-Level Optimization via an Equilibrium Constrained Problem

Optimization problems involving equilibrium constraints capture diverse optimization settings such as bi-level optimization, min-max problems and games, and the minimization over non-linear constraints. This paper introduces an Augmented Lagrangian approach with Hessian-vector product approximation to address an equilibrium constrained nonconvex nonsmooth optimization problem. The underlying model in particular captures various settings of bi-level optimization problems, …

Convergence of Descent Optimization Algorithms under Polyak-Lojasiewicz-Kurdyka Conditions

This paper develops a comprehensive convergence analysis for generic classes of descent algorithms in nonsmooth and nonconvex optimization under several conditions of the Polyak-Lojasiewicz-Kurdyka (PLK) type. Among other results, we prove the finite termination of generic algorithms under the PLK conditions with lower exponents. Specifications are given to establish new convergence rates for inexact reduced …

prunAdag: an adaptive pruning-aware gradient method

A pruning-aware adaptive gradient method is proposed which classifies the variables in two sets before updating them using different strategies. This technique extends the “relevant/irrelevant” approach of Ding (2019) and Zimmer et al. (2022) and allows a posteriori sparsification of the solution of model parameter fitting problems. The new method is proved to be convergent …

A necessary condition for the guarantee of the superiorization method

We study a method that involves principally convex feasibility-seeking and makes secondary efforts of objective function value reduction. This is the well-known superiorization method (SM), where the iterates of an asymptotically convergent iterative feasibility-seeking algorithm are perturbed by objective function nonascent steps. We investigate the question under what conditions a sequence generated by an SM …

Facial structure of copositive and completely positive cones over a second-order cone

We classify the faces of copositive and completely positive cones over a second-order cone and investigate their dimension and exposedness properties. Then we compute two parameters related to chains of faces of both cones. At the end, we discuss some possible extensions of the results with a view toward analyzing the facial structure of general …

An Oracle-based Approach for Price-setting Problems in Logistics

We study a bilevel hub location problem where, on the upper level, a shipment service provider (the leader) builds a transportation network and sets the prices of shipments on each possible transportation relation. Here, the leader has to take into account the reaction of the customers (the follower), who will only purchase transport services depending on …

Coherent Local Explanations for Mathematical Optimization

The surge of explainable artificial intelligence methods seeks to enhance transparency and explainability in machine learning models. At the same time, there is a growing demand for explaining decisions taken through complex algorithms used in mathematical optimization. However, current explanation methods do not take into account the structure of the underlying optimization problem, leading to …

Newtonian Methods with Wolfe Linesearch in Nonsmooth Optimization and Machine Learning

This paper introduces and develops coderivative-based Newton methods with Wolfe linesearch conditions to solve various classes of problems in nonsmooth optimization and machine learning. We first propose a generalized regularized Newton method with Wolfe linesearch (GRNM-W) for unconstrained $C^{1,1}$ minimization problems (which are second-order nonsmooth) and establish global as well as local superlinear convergence of …