Motivated by data-driven approaches to sequential decision-making under uncertainty, we study maximum likelihood estimation of a distribution over a general measurable space when, unlike in traditional setups, realizations of the underlying uncertainty are not directly observable but are instead known to lie within observable sets. While extant work has studied the special cases in which the observed sets are intervals in \(\mathbb{R}^n\) for \(n=1,2\), our work provides, to the best of our knowledge, a first rigorous treatment of the more general estimation problem. Our results show that maximum likelihood estimates concentrate on the sets of a collection of maximal intersections (CMI), and can be found by solving a convex optimization problem whose size is linear in the size of the CMI. After studying the efficient computation of the CMI and the maximum likelihood estimate, we characterize convergence properties of the maximum likelihood estimate and apply our results to construct ambiguity sets and develop compact formulations for Distributionally Robust and Greedy and Optimistic Optimization. Our results show how non-parametric maximum likelihood estimation can be incorporated effectively into data-driven optimization problems, yielding tractable formulations that we test numerically.
Maximum Likelihood Probability Measures over Sets and Applications to Data-Driven Optimization