Mixed-Feature Logistic Regression Robust to Distribution Shifts

Logistic regression models are widely used in the social and behavioral sciences and in high-stakes domains due to their simplicity and interpretability. At the same time, such domains are permeated by distribution shifts, where the distribution generating the data changes between training and deployment. In this paper, we study a distributionally robust logistic regression …
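
For context, a generic distributionally robust formulation of logistic regression (a sketch of the standard setup; the paper's exact ambiguity set for mixed features may differ) is
\[
\min_{\beta} \; \sup_{Q \in \mathcal{B}_{\varepsilon}(\widehat{P}_n)} \; \mathbb{E}_{(x,y)\sim Q}\!\left[\log\!\left(1 + e^{-y\,\beta^{\top}x}\right)\right],
\]
where \( \widehat{P}_n \) is the empirical training distribution and \( \mathcal{B}_{\varepsilon}(\widehat{P}_n) \) is an ambiguity ball of radius \( \varepsilon \) around it.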

Stability Regularized Cross-Validation

We revisit the problem of ensuring strong test-set performance via cross-validation. Motivated by the generalization theory literature, we propose a nested k-fold cross-validation scheme that selects hyperparameters by minimizing a weighted sum of the usual cross-validation metric and an empirical model-stability measure. The weight on the stability term is itself chosen via a nested …
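
In symbols, the selection rule described above (a sketch; the notation is ours, not the paper's) picks the hyperparameter \( \lambda \) as
\[
\widehat{\lambda} \in \arg\min_{\lambda} \; \mathrm{CV}_k(\lambda) + w \cdot \mathrm{Stab}_k(\lambda),
\]
where \( \mathrm{CV}_k \) is the usual k-fold cross-validation error, \( \mathrm{Stab}_k \) is an empirical measure of how much the fitted model varies across folds, and the weight \( w \) is itself tuned on an inner (nested) layer of cross-validation.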

A Dantzig-Wolfe Decomposition Method for Quasi-Variational Inequalities

We propose an algorithm for solving quasi-variational inequality problems based on the Dantzig-Wolfe decomposition paradigm. Our approach solves variational inequalities, a simpler class of problems, in the subproblems, while restricting the quasi-variational inequalities to the master problems, which are generally much smaller when the original problem is large-scale. We prove global convergence of our …
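
To fix ideas, a quasi-variational inequality asks for a point whose feasible set depends on the point itself (the standard definition, not specific to this paper): find \( x^* \in K(x^*) \) such that
\[
\langle F(x^*),\, x - x^* \rangle \geq 0 \quad \text{for all } x \in K(x^*);
\]
a variational inequality is the special case where \( K(\cdot) \equiv K \) is fixed, which is why solving VIs in the subproblems is the easier task.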

Multi-cut stochastic approximation methods for solving stochastic convex composite optimization

This paper considers the stochastic convex composite optimization problem and presents multi-cut stochastic approximation (SA) methods for solving it, whose models overestimate the objective function in expectation. The multi-cut model, obtained by taking the maximum of a finite number of linearizations of the stochastic objective function, provides a biased estimate of the objective function, with …
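
Concretely, a multi-cut model of this type (a sketch in our own notation) takes the form
\[
\Gamma_k(x) = \max_{1 \le i \le k} \left\{ f(x_i; \xi_i) + \langle s_i,\, x - x_i \rangle \right\},
\]
the pointwise maximum of linearizations of the stochastic objective at points \( x_i \) with stochastic subgradients \( s_i \); even when each individual cut is an unbiased linearization, taking their maximum is what introduces the bias mentioned above.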

Stability analysis for two-level value functions and application to numerically solve a pessimistic bilevel program

Some stability results are presented for a two-level value function, which is the optimal value function of a parametric optimization problem constrained by the optimal solution set of another parametric optimization problem. It is then shown how to use these stability results to write down (and subsequently compute) stationary points for a pessimistic bilevel optimization …
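
For reference, a two-level value function of this kind (a sketch in generic notation) can be written as
\[
\varphi(x) = \min_{y} \left\{ F(x, y) \;:\; y \in S(x) \right\}, \qquad S(x) = \arg\min_{z} \left\{ f(x, z) \;:\; z \in Z(x) \right\},
\]
i.e., the optimal value of a problem constrained by the solution set \( S(x) \) of a parametric lower-level problem; in the pessimistic bilevel setting the inner \( \min \) over \( y \) is replaced by a \( \max \) over the same set.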

Inverse Optimization via Learning Feasible Regions

We study inverse optimization (IO), where the goal is to use a parametric optimization program as the hypothesis class to infer relationships between input-decision pairs. Most of the literature focuses on learning only the objective function, as learning the constraints (i.e., the feasible region) leads to nonconvex training programs. Motivated by this, we focus on …
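
Schematically, the IO hypothesis class described above (our notation, not the paper's) maps an input signal \( u \) to a decision via a parametric program
\[
x(\theta; u) \in \arg\min_{x} \left\{ \langle c(\theta; u),\, x \rangle \;:\; x \in \mathcal{X}(\theta; u) \right\},
\]
and the learning problem is to choose \( \theta \) so that the program's optimal solutions reproduce the observed input-decision pairs; learning the feasible region \( \mathcal{X}(\theta; u) \), rather than only the objective, is the difficult case this paper targets.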

cuHALLaR: A GPU accelerated low-rank augmented Lagrangian method for large-scale semidefinite programming

This paper introduces cuHALLaR, a GPU-accelerated implementation of the HALLaR method proposed in Monteiro et al. (2024) for solving large-scale semidefinite programming (SDP) problems. We demonstrate how our Julia-based implementation efficiently uses GPU parallelism by optimizing simple but key operations, including linear maps, adjoints, and gradient evaluations. Extensive numerical experiments across three problem classes: maximum …
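
As background, low-rank augmented Lagrangian methods for SDP work with a factorization \( X = YY^{\top} \) and, in a standard form (a sketch; HALLaR's exact inner scheme is more involved), minimize
\[
\mathcal{L}_{\beta}(Y; p) = \langle C,\, YY^{\top} \rangle + \langle p,\, \mathcal{A}(YY^{\top}) - b \rangle + \frac{\beta}{2} \left\| \mathcal{A}(YY^{\top}) - b \right\|^2
\]
over the low-rank factor \( Y \), updating the multiplier \( p \) between inner solves; the linear map \( \mathcal{A} \), its adjoint, and gradients of \( \mathcal{L}_{\beta} \) are exactly the operations the abstract highlights as targets for GPU parallelism.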

Subgradient Regularization: A Descent-Oriented Subgradient Method for Nonsmooth Optimization

In nonsmooth optimization, a negative subgradient is not necessarily a descent direction, making the design of convergent descent methods based on zeroth-order and first-order information a challenging task. The well-studied bundle methods and gradient sampling algorithms construct descent directions by aggregating subgradients at nearby points in seemingly different ways, and are often complicated or lack …
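
A common thread behind both families mentioned above is, roughly, to take the minimum-norm element of a bundle of nearby subgradients as the search direction (a standard construction, sketched here in generic form):
\[
d = -\arg\min \left\{ \|g\| \;:\; g \in \operatorname{conv}\{g_1, \dots, g_m\} \right\}, \qquad g_i \in \partial f(y_i), \; y_i \approx x,
\]
which yields a descent direction when the aggregated set approximates the subdifferential near \( x \) well enough; the two method families differ mainly in how the points \( y_i \) are generated.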

An Adaptive and Parameter-Free Nesterov’s Accelerated Gradient Method for Convex Optimization

We propose AdaNAG, an adaptive accelerated gradient method based on Nesterov’s accelerated gradient method. AdaNAG is line-search-free, parameter-free, and achieves the accelerated convergence rates \( f(x_k) - f_\star = \mathcal{O}\left(1/k^2\right) \) and \( \min_{i\in\left\{1,\dots, k\right\}} \|\nabla f(x_i)\|^2 = \mathcal{O}\left(1/k^3\right) \) for an \( L \)-smooth convex function \( f \). We provide a Lyapunov analysis for …
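
For orientation, the classical Nesterov recursion that such methods build on reads (this is the textbook scheme with a fixed step \( 1/L \), not AdaNAG's adaptive, parameter-free rule):
\[
x_{k+1} = y_k - \tfrac{1}{L} \nabla f(y_k), \qquad t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}, \qquad y_{k+1} = x_{k+1} + \frac{t_k - 1}{t_{k+1}} \left( x_{k+1} - x_k \right),
\]
which attains \( f(x_k) - f_\star = \mathcal{O}(1/k^2) \); AdaNAG's contribution is achieving this rate (plus the gradient-norm rate above) without knowing \( L \) and without line searches.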

Convergence of Mean-Field Langevin Stochastic Descent-Ascent for Distributional Minimax Optimization

We study convergence properties of the discrete-time Mean-Field Langevin Stochastic Descent-Ascent (MFL-SDA) algorithm for solving distributional minimax optimization problems. These problems arise in applications such as zero-sum games, generative adversarial networks, and distributionally robust learning. Despite the significance of MFL-SDA in these contexts, its discrete-time convergence rate remains underexplored. To address this gap, we establish …
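
As a reference point, distributional minimax problems of this kind (written here in a generic entropy-regularized form; the paper's exact setting may differ) look like
\[
\min_{\mu \in \mathcal{P}(\mathbb{R}^d)} \; \max_{\nu \in \mathcal{P}(\mathbb{R}^d)} \; \mathcal{L}(\mu, \nu) + \lambda\, \mathrm{KL}(\mu \,\|\, \pi) - \lambda\, \mathrm{KL}(\nu \,\|\, \pi),
\]
optimizing over probability measures rather than points, with \( \pi \) a reference measure; MFL-SDA runs noisy (Langevin) gradient descent on particles representing \( \mu \) and ascent on particles representing \( \nu \), and the paper's question is how fast these discrete-time iterates converge.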