Wasserstein Distributionally Robust Optimization with Heterogeneous Data Sources

We study decision problems under uncertainty, where the decision-maker has access to K data sources that carry biased information about the underlying risk factors. The biases are measured by the mismatch between the risk factor distribution and the K data-generating distributions with respect to an optimal transport (OT) distance. In this situation the decision-maker can … Read more

Decision-making with Side Information: A Causal Transport Robust Approach

We consider stochastic optimization with side information where, prior to decision making, covariate data are available to inform better decisions. In particular, we propose to consider a distributionally robust formulation based on causal transport distance. Compared with divergence and Wasserstein metric, the causal transport distance is better at capturing the information structure revealed from the conditional distribution … Read more

Data-driven Multistage Distributionally Robust Linear Optimization with Nested Distance

We study multistage distributionally robust linear optimization, where the uncertainty set is defined as a ball of distribution centered at a scenario tree using the nested distance. The resulting minimax problem is notoriously difficult to solve due to its inherent non-convexity. In this paper, we demonstrate that, under mild conditions, the robust risk evaluation of … Read more

A General Wasserstein Framework for Data-driven Distributionally Robust Optimization: Tractability and Applications

Data-driven distributionally robust optimization is a recently emerging paradigm aimed at finding a solution that is driven by sample data but is protected against sampling errors. An increasingly popular approach, known as Wasserstein distributionally robust optimization (DRO), achieves this by applying the Wasserstein metric to construct a ball centred at the empirical distribution and finding … Read more

A Sparse Interior Point Method for Linear Programs arising in Discrete Optimal Transport

Discrete Optimal Transport problems give rise to very large linear programs (LP) with a particular structure of the constraint matrix. In this paper we present an interior point method (IPM) specialized for the LP originating from the Kantorovich Optimal Transport problem. Knowing that optimal solutions of such problems display a high degree of sparsity, we … Read more

A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance

The Wasserstein distance has become increasingly important in machine learning and deep learning. Despite its popularity, the Wasserstein distance is hard to approximate because of the curse of dimensionality. A recently proposed approach to alleviate the curse of dimensionality is to project the sampled data from the high dimensional probability distribution onto a lower-dimensional subspace, … Read more

Mathematical Foundations of Robust and Distributionally Robust Optimization

Robust and distributionally robust optimization are modeling paradigms for decision-making under uncertainty where the uncertain parameters are only known to reside in an uncertainty set or are governed by any probability distribution from within an ambiguity set, respectively, and a decision is sought that minimizes a cost function under the most adverse outcome of the … Read more

Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution

Semi-discrete optimal transport problems, which evaluate the Wasserstein distance between a discrete and a generic (possibly non-discrete) probability measure, are believed to be computationally hard. Even though such problems are ubiquitous in statistics, machine learning and computer vision, however, this perception has not yet received a theoretical justification. To fill this gap, we prove that … Read more

Optimal Transport in the Face of Noisy Data

Optimal transport distances are popular and theoretically well understood in the context of data-driven prediction. A flurry of recent work has popularized these distances for data-driven decision-making as well although their merits in this context are far less well understood. This in contrast to the more classical entropic distances which are known to enjoy optimal … Read more

Shape-Constrained Regression using Sum of Squares Polynomials

We consider the problem of fitting a polynomial function to a set of data points, each data point consisting of a feature vector and a response variable. In contrast to standard polynomial regression, we require that the polynomial regressor satisfy shape constraints, such as monotonicity, Lipschitz-continuity, or convexity. We show how to use semidefinite programming … Read more