Decision-making with Side Information: A Causal Transport Robust Approach

We consider stochastic optimization with side information where, prior to decision making, covariate data are available to inform better decisions. In particular, we propose to consider a distributionally robust formulation based on causal transport distance. Compared with divergence and Wasserstein metric, the causal transport distance is better at capturing the information structure revealed from the conditional distribution … Read more

Data-driven Multistage Distributionally Robust Optimization with Nested Distance: Time Consistency and Tractable Dynamic Reformulations

We study multistage distributionally robust optimization in which the uncertainty set consists of stochastic processes that are close to a scenario tree in the nested distance (Pflug and Pichler (2012)). Compared to other choices such as Wasserstein distance between stochastic processes, the nested distance accounts for information evolution, making it hedge against a plausible family … Read more

A General Wasserstein Framework for Data-driven Distributionally Robust Optimization: Tractability and Applications

Data-driven distributionally robust optimization is a recently emerging paradigm aimed at finding a solution that is driven by sample data but is protected against sampling errors. An increasingly popular approach, known as Wasserstein distributionally robust optimization (DRO), achieves this by applying the Wasserstein metric to construct a ball centred at the empirical distribution and finding … Read more

A Sparse Interior Point Method for Linear Programs arising in Discrete Optimal Transport

Discrete Optimal Transport problems give rise to very large linear programs (LP) with a particular structure of the constraint matrix. In this paper we present an interior point method (IPM) specialized for the LP originating from the Kantorovich Optimal Transport problem. Knowing that optimal solutions of such problems display a high degree of sparsity, we … Read more

A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance

The Wasserstein distance has become increasingly important in machine learning and deep learning. Despite its popularity, the Wasserstein distance is hard to approximate because of the curse of dimensionality. A recently proposed approach to alleviate the curse of dimensionality is to project the sampled data from the high dimensional probability distribution onto a lower-dimensional subspace, … Read more

Mathematical Foundations of Robust and Distributionally Robust Optimization

Robust and distributionally robust optimization are modeling paradigms for decision-making under uncertainty where the uncertain parameters are only known to reside in an uncertainty set or are governed by any probability distribution from within an ambiguity set, respectively, and a decision is sought that minimizes a cost function under the most adverse outcome of the … Read more

Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution

Semi-discrete optimal transport problems, which evaluate the Wasserstein distance between a discrete and a generic (possibly non-discrete) probability measure, are believed to be computationally hard. Even though such problems are ubiquitous in statistics, machine learning and computer vision, however, this perception has not yet received a theoretical justification. To fill this gap, we prove that … Read more

Optimal Transport in the Face of Noisy Data

Optimal transport distances are popular and theoretically well understood in the context of data-driven prediction. A flurry of recent work has popularized these distances for data-driven decision-making as well although their merits in this context are far less well understood. This in contrast to the more classical entropic distances which are known to enjoy optimal … Read more

Shape-Constrained Regression using Sum of Squares Polynomials

We consider the problem of fitting a polynomial function to a set of data points, each data point consisting of a feature vector and a response variable. In contrast to standard polynomial regression, we require that the polynomial regressor satisfy shape constraints, such as monotonicity, Lipschitz-continuity, or convexity. We show how to use semidefinite programming … Read more

Optimal Transport Based Distributionally Robust Optimization: Structural Properties and Iterative Schemes

We consider optimal transport based distributionally robust optimization (DRO) problems with locally strongly convex transport cost functions and affine decision rules. Under conventional convexity assumptions on the underlying loss function, we obtain structural results about the value function, the optimal policy, and the worst-case optimal transport adversarial model. These results expose a rich structure embedded … Read more