Projected Stochastic Momentum Methods for Nonlinear Equality-Constrained Optimization for Machine Learning

Two algorithms are proposed, analyzed, and tested for solving continuous optimization problems with nonlinear equality constraints. Each is an extension of a stochastic momentum-based method from the unconstrained setting to the setting of a stochastic Newton-SQP-type algorithm for solving equality-constrained problems. One is an extension of the heavy-ball method and the other is an extension … Read more

Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interpretable and Tractable Approach

In this paper, we introduce a framework for contextual distributionally robust optimization (DRO) that considers the causal and continuous structure of the underlying distribution by developing interpretable and tractable decision rules that prescribe decisions using covariates. We first introduce the causal Sinkhorn discrepancy (CSD), an entropy-regularized causal Wasserstein distance that encourages continuous transport plans while … Read more

First-order Methods for Unconstrained Vector Optimization Problems: A Unified Majorization-Minimization Perspective

In this paper, we develop a unified majorization-minimization scheme and convergence analysis with first-order surrogate functions for unconstrained vector optimization problems (VOPs). By selecting different surrogate functions, the unified method can be reduced to various existing first-order methods. The unified convergence analysis reveals that the slow convergence of the steepest descent method is primarily attributed … Read more

A Gauge Set Framework for Flexible Robustness Design

This paper proposes a unified framework for designing robustness in optimization under uncertainty using gauge sets, convex sets that generalize distance and capture how distributions may deviate from a nominal reference. Representing robustness through a gauge set reweighting formulation brings many classical robustness paradigms under a single convex-analytic perspective. The corresponding dual problem, the upper … Read more

Primal-dual resampling for solution validation in convex stochastic programming

Suppose we wish to determine the quality of a candidate solution to a convex stochastic program in which the objective function is a statistical functional parameterized by the decision variable and known deterministic constraints may be present. Inspired by stopping criteria in primal-dual and interior-point methods, we develop cancellation theorems that characterize the convergence of … Read more

A Theoretical Framework for Auxiliary-Loss-Free Load Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models

In large-scale AI training, Sparse Mixture-of-Experts (s-MoE) layers enable scaling by activating only a small subset of experts per token. An operational challenge in this design is load balancing: routing tokens to minimize the number of idle experts, which is important for the efficient utilization of (costly) GPUs. We provide a theoretical framework for analyzing … Read more

On Solving Chance-Constrained Models with Gaussian Mixture Distribution

We study linear chance-constrained problems where the coefficients follow a Gaussian mixture distribution. We provide mixed-binary quadratic programs that give inner and outer approximations of the chance constraint based on piecewise linear approximations of the standard normal cumulative density function. We show that $O\left(\sqrt{\ln(1/\tau)/\tau} \right)$ pieces are sufficient to attain $\tau$-accuracy in the chance constraint. … Read more

Stochastic Mixed-Integer Programming: A Survey

The goal of this survey is to provide a road-map for exploring the growing area of stochastic mixed-integer programming (SMIP) models and algorithms. We provide a comprehensive overview of existing decomposition algorithms for two-stage SMIPs, including Dantzig-Wolfe decomposition, dual decomposition, Lagrangian cuts, and decomposition approaches using parametric cutting planes and scaled cuts. Moreover, we explicitly … Read more

Stronger cuts for Benders’ decomposition for stochastic Unit Commitment Problems based on interval variables

The Stochastic Unit Commitment (SUC) problem models the scheduling of power generation units under uncertainty, typically using a two-stage stochastic program with integer first-stage and continuous second-stage variables. We propose a new Benders decomposition approach that leverages an extended formulation based on interval variables, enabling decomposition by both unit and time interval under mild technical … Read more