Reliable Off-policy Evaluation for Reinforcement Learning

In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy using logged trajectory data generated from a different behavior policy, without execution of the target policy. Reinforcement learning in high-stake environments, such as healthcare and education, is often limited to off-policy settings due to safety or ethical concerns, or … Read more

A Robust Approach for Modeling Limited Observability in Bilevel Optimization

In bilevel optimization, hierarchical optimization problems are considered in which two players – the leader and the follower – act and react in a non-cooperative and sequential manner. In many real-world applications, the leader may face a follower whose reaction deviates from the one expected by the leader due to some kind of bounded rationality. … Read more

Heteroscedasticity-aware residuals-based contextual stochastic optimization

We explore generalizations of some integrated learning and optimization frameworks for data-driven contextual stochastic optimization that can adapt to heteroscedasticity. We identify conditions on the stochastic program, data generation process, and the prediction setup under which these generalizations possess asymptotic and finite sample guarantees for a class of stochastic programs, including two-stage stochastic mixed-integer programs … Read more

First-order algorithms for robust optimization problems via convex-concave saddle-point Lagrangian reformulation

Robust optimization (RO) is one of the key paradigms for solving optimization problems affected by uncertainty. Two principal approaches for RO, the robust counterpart method and the adversarial approach, potentially lead to excessively large optimization problems. For that reason, first order approaches, based on online-convex-optimization, have been proposed (Ben-Tal et al. (2015), Kilinc-Karzan and Ho-Nguyen … Read more

Model-Free Assortment Pricing with Transaction Data

We study a problem in which a firm sets prices for products based on the transaction data, i.e., which product past customers chose from an assortment and what were the historical prices that they observed. Our approach does not impose a model on the distribution of the customers’ valuations and only assumes, instead, that purchase … Read more

Distributionally robust second-order stochastic dominance constrained optimization with Wasserstein distance

We consider a distributionally robust second-order stochastic dominance constrained optimization problem. We require the dominance constraints hold with respect to all probability distributions in a Wasserstein ball centered at the empirical distribution. We adopt the sample approximation approach to develop a linear programming formulation that provides a lower bound. We propose a novel split-and-dual decomposition … Read more

The Landscape of the Proximal Point Method for Nonconvex-Nonconcave Minimax Optimization

Minimax optimization has become a central tool for modern machine learning with applications in generative adversarial networks, robust optimization, reinforcement learning, etc. These applications are often nonconvex-nonconcave, but the existing theory is unable to identify and deal with the fundamental difficulties posed by nonconvex-nonconcave structures. In this paper, we study the classic proximal point method … Read more

Data-Driven Optimization with Distributionally Robust Second-Order Stochastic Dominance Constraints

Optimization with stochastic dominance constraints has recently received an increasing amount of attention in the quantitative risk management literature. Instead of requiring that the probabilistic description of the uncertain parameters be exactly known, this paper presents the first comprehensive study of a data-driven formulation of the distributionally robust second-order stochastic dominance constrained problem (DRSSDCP) that … Read more

Kernel Distributionally Robust Optimization

We propose kernel distributionally robust optimization (Kernel DRO) using insights from the robust optimization theory and functional analysis. Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity sets, including sets based on integral probability metrics and finite-order moment bounds. This perspective unifies multiple existing robust and stochastic optimization … Read more

Pareto Adaptive Robust Optimality via a Fourier-Motzkin Elimination Lens

We formalize the concept of Pareto Adaptive Robust Optimality (PARO) for linear Adaptive Robust Optimization (ARO) problems. A worst-case optimal solution pair of here-and-now decisions and wait-and-see decisions is PARO if it cannot be Pareto dominated by another solution, i.e., there does not exist another such pair that performs at least as good in all … Read more