DFO: A Framework for Data-driven Decision-making with Endogenous Outliers

A typical data-driven stochastic program aims to seek the best decision that minimizes the sum of a deterministic cost function and an expected recourse function under a given distribution. Recently, much success has been witnessed in the development of Distributionally Robust Optimization (DRO), which considers the worst-case expected recourse function under the least favorable probability … Read more

Distributionally Robust Fair Transit Resource Allocation During a Pandemic

This paper studies Distributionally robust Fair transit Resource Allocation model (DrFRAM) under Wasserstein ambiguity set to optimize the public transit resource allocation during a pandemic. We show that the proposed DrFRAM is highly nonconvex and nonlinear and is, in general, NP-hard. Fortunately, we show that DrFRAM can be reformulated as a mixed-integer linear programming (MILP) … Read more

Second-Order Conic and Polyhedral Approximations of the Exponential Cone: Application to Mixed-Integer Exponential Conic Programs

Exponents and logarithms exist in many important applications such as logistic regression, maximum likelihood, relative entropy and so on. Since the exponential cone can be viewed as the epigraph of perspective of the natural exponential function or the hypograph of perspective of the natural logarithm function, many mixed-integer nonlinear convex programs involving exponential or logarithm … Read more

Beyond Symmetry: Best Submatrix Selection for the Sparse Truncated SVD

Truncated singular value decomposition (SVD), also known as the best low-rank matrix approximation, has been successfully applied to many domains such as biology, healthcare, and others, where high-dimensional datasets are prevalent. To enhance the interpretability of the truncated SVD, sparse SVD (SSVD) is introduced to select a few rows and columns of the original matrix … Read more

Unbiased Subdata Selection for Fair Classification: A Unified Framework and Scalable Algorithms

As an important problem in modern data analytics, classification has witnessed varieties of applications from different domains. Different from conventional classification approaches, fair classification concerns the issues of unintentional biases against the sensitive features (e.g., gender, race). Due to high nonconvexity of fairness measures, existing methods are often unable to model exact fairness, which can … Read more

ALSO-X and ALSO-X+: Better Convex Approximations for Chance Constrained Programs

In a chance constrained program (CCP), the decision-makers aim to seek the best decision whose probability of violating the uncertainty constraints is within the prespecified risk level. As a CCP is often nonconvex and is difficult to solve to optimality, much effort has been devoted to developing convex inner approximations for a CCP, among which … Read more

Exact and Approximation Algorithms for Sparse PCA

Sparse PCA (SPCA) is a fundamental model in machine learning and data analytics, which has witnessed a variety of application areas such as finance, manufacturing, biology, healthcare. To select a prespecified-size principal submatrix from a covariance matrix to maximize its largest eigenvalue for the better interpretability purpose, SPCA advances the conventional PCA with both feature … Read more

Regret in the Newsvendor Model with Demand and Yield Randomness

We study the fundamental stochastic newsvendor model that considers both demand and yield randomness. It is usually difficult in practice to describe precisely the joint demand and yield distribution, although partial statistical information and empirical data about this ambiguous distribution are often accessible. We combat the issue of distributional ambiguity by taking a data-driven distributionally … Read more

Distributionally Robust Bottleneck Combinatorial Problems: Uncertainty Quantification and Robust Decision Making

In a bottleneck combinatorial problem, the objective is to minimize the highest cost of elements of a subset selected from the combinatorial solution space. This paper studies data-driven distributionally robust bottleneck combinatorial problems (DRBCP) with stochastic costs, where the probability distribution of the cost vector is contained in a ball of distributions centered at the … Read more

Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees

This paper studies a classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix with a given size out of a covariance matrix from a system. MESP has been widely applied to many areas, including healthcare, power system, manufacturing, data science, etc. Investigating its Lagrangian dual and primal characterization, we … Read more