# Near-Optimal Ambiguity sets for Distributionally Robust Optimization

We propose a novel, Bayesian framework for assessing the relative strengths of data-driven ambiguity sets in distributionally robust optimization (DRO). The key idea is to measure the relative size between a candidate ambiguity set and an \emph{asymptotically optimal} set as the amount of data grows large. This asymptotically optimal set is provably the smallest convex ambiguity set that satisfies a specific, Bayesian robustness guarantee, i.e., it is a subset of any other convex set that also satisfies this guarantee. Perhaps surprisingly, we show what existing, popular ambiguity set proposals that are based on statistical confidence regions are necessarily significantly larger than this asymptotically optimal set; the ratio of their sizes scales with the square root of the dimension of the ambiguity. These results suggest that current DRO models utilizing these popular ambiguity sets are unnecessarily conservative. Consequently, we also propose a new class of ambiguity sets which satisfy our Bayesian robustness guarantee, are tractable, enjoy the usual asymptotic convergence properties, and, most importantly, are only a small, explicitly known factor larger than the asymptotically optimal set. We discuss extensively how these results give rise to simple guidelines for practitioners with respect to selecting ambiguity sets and formulating DRO models, with special emphasis on the case of ambiguity sets for finite, discrete probability vectors. Computational evidence in portfolio allocation using real and simulated data confirm that these theoretical framework and results provide useful, practical insight into the empirical performance of DRO models in real applications, and that our new near-optimal sets outperform their traditional confidence region variants.

## Citation

Accepted in Management Science.