Optimization in Data Science – Page 6

Finding Regions of Counterfactual Explanations via Robust Optimization

Published: 2023/05/16

Categories: Optimization in Data Science, Robust Optimization

Keywords: counterfactual explanation, explainable AI, machine learning, robust optimization

Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimally perturbed data point for which the decision of the model changes. Most existing methods can provide only one CE, which may not be achievable for the user. In this …
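To make the definition of a CE concrete, here is a minimal sketch for the simplest case: a linear classifier sign(w·x + b), where the closest point with a different decision (in Euclidean norm) is the projection onto the decision boundary, pushed slightly past it. This only illustrates the definition; it is not the robust-optimization method of the paper, and the names `closest_counterfactual`, `predict`, and `margin` are hypothetical.

```python
import math


def predict(x, w, b):
    """Label assigned by the linear classifier sign(w.x + b)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def closest_counterfactual(x, w, b, margin=1e-6):
    """Nearest point to x (Euclidean norm) that receives the other label.

    Hypothetical helper: projects x onto the hyperplane w.x + b = 0 and
    overshoots by a small relative margin so the label actually flips.
    Assumes x is not already exactly on the decision boundary.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_sq = sum(wi * wi for wi in w)
    if norm_sq == 0:
        raise ValueError("w must be non-zero")
    step = (1.0 + margin) * score / norm_sq
    return [xi - step * wi for xi, wi in zip(x, w)]


# Illustrative data (assumed, not from the paper).
w, b = [1.0, 2.0], -4.0
x = [1.0, 1.0]
x_cf = closest_counterfactual(x, w, b)
distance = math.sqrt(sum((a - c) ** 2 for a, c in zip(x, x_cf)))
print(predict(x, w, b), predict(x_cf, w, b), distance)
```

A single point like `x_cf` is exactly the kind of one-off CE the abstract describes as potentially unachievable for the user, which is the motivation for looking at regions instead.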

Maximum Likelihood Probability Measures over Sets and Applications to Data-Driven Optimization

Published: 2023/05/15

Juan Borrero

Denis Saure

Categories: Applications - OR and Management Sciences, Optimization in Data Science, Robust Optimization

Keywords: data-driven decision making, distributionally robust optimization, maximum likelihood estimation

Motivated by data-driven approaches to sequential decision-making under uncertainty, we study maximum likelihood estimation of a distribution over a general measurable space when, unlike traditional setups, realizations of the underlying uncertainty are not directly observable but instead are known to lie within observable sets. While extant work studied the special cases when the observed …

Optimized Dimensionality Reduction for Moment-based Distributionally Robust Optimization

Published: 2023/05/06

Categories: Optimization in Data Science, Robust Optimization, Semi-definite Programming

Keywords: data-driven optimization, dimensionality reduction, distributionally robust optimization, principal component analysis, semidefinite programming

Moment-based distributionally robust optimization (DRO) provides an optimization framework to integrate statistical information with traditional optimization approaches. Under this framework, one assumes that the underlying joint distribution of random parameters runs in a distributional ambiguity set constructed by moment information and makes decisions against the worst-case distribution within the set. Although most moment-based DRO problems …

When Deep Learning Meets Polyhedral Theory: A Survey

Published: 2023/04/29, Updated: 2023/08/31

Categories: (Mixed) Integer Linear Programming, Optimization in Data Science, Polyhedra

In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks in tasks such as computer vision and natural language processing. Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise constant and piecewise linear functions such as the Rectified …

A Stochastic-Gradient-based Interior-Point Algorithm for Solving Smooth Bound-Constrained Optimization Problems

Published: 2023/04/28

Qi Wang

Frank E. Curtis

Daniel P. Robinson

Vyacheslav Kungurtsev

Categories: Constrained Nonlinear Optimization, Nonlinear Optimization, Optimization in Data Science

Keywords: interior point method, stochastic gradient, stochastic optimization

A stochastic-gradient-based interior-point algorithm for minimizing a continuously differentiable objective function (that may be nonconvex) subject to bound constraints is presented, analyzed, and demonstrated through experimental results. The algorithm is unique from other interior-point methods for solving smooth (nonconvex) optimization problems since the search directions are computed using stochastic gradient estimates. It is also unique …

Balancing Communication and Computation in Gradient Tracking Algorithms for Decentralized Optimization

Published: 2023/03/27

Albert S. Berahas

Raghu Bollapragada

Shagun Gupta

Categories: Convex Optimization, Nonlinear Optimization, Optimization in Data Science

Keywords: communication, computation, decentralized optimization, Gradient Tracking Methods, network optimization

Gradient tracking methods have emerged as one of the most popular approaches for solving decentralized optimization problems over networks. In this setting, each node in the network has a portion of the global objective function, and the goal is to collectively optimize this function. At every iteration, gradient tracking methods perform two operations (steps): (1) …

The Online Shortest Path Problem: Learning Travel Times Using A Multi-Armed Bandit Framework

Published: 2023/03/22

Tomas Lagos

Ramon Auad

Felipe Lagos

Categories: Applications - OR and Management Sciences, Data Science Algorithms, Transportation

Keywords: kriging, last mile logistics, machine learning, Multi-Armed Bandits, Online Shortest Path, Thompson Sampling

In the age of e-commerce, many logistic companies must operate in large road networks without accurate knowledge of travel times for their specific fleet of vehicles. Moreover, millions of dollars are spent on routing services that do not accurately capture the specific characteristics of the companies’ drivers and the types of vehicles they must use. …

Mixed-Integer Quadratic Optimization and Iterative Clustering Techniques for Semi-Supervised Support Vector Machines

Published: 2023/03/22, Updated: 2023/10/02

Martin Schmidt

Jan Pablo Burgard

Maria Eduarda Pinheiro

Categories: (Mixed) Integer Nonlinear Programming, Data Science Algorithms, Data Science Applications

Keywords: clustering, mixed-integer quadratic optimization, Semi-Supervised Learning, support vector machines

Among the most famous algorithms for solving classification problems are support vector machines (SVMs), which find a separating hyperplane for a set of labeled data points. In some applications, however, labels are only available for a subset of points. Furthermore, this subset can be non-representative, e.g., due to self-selection in a survey. Semi-supervised SVMs tackle …

On the Optimization Landscape of Burer-Monteiro Factorization: When do Global Solutions Correspond to Ground Truth?

Published: 2023/02/21

Jianhao Ma

Salar Fattahi

Categories: Global Optimization, Nonlinear Optimization, Optimization in Data Science

Keywords: low-rank matrix recovery, nonconvex optimization, nonsmooth optimization

In low-rank matrix recovery, the goal is to recover a low-rank matrix, given a limited number of linear and possibly noisy measurements. Low-rank matrix recovery is typically solved via a nonconvex method called Burer-Monteiro factorization (BM). If the rank of the ground truth is known, BM is free of sub-optimal local solutions, and its true solutions …

Variable Selection for Kernel Two-Sample Tests

Published: 2023/02/15, Updated: 2023/10/12

Jie Wang

Santanu S. Dey

Yao Xie

Categories: (Mixed) Integer Nonlinear Programming, Optimization in Data Science

Keywords: maximum mean discrepancy, mixed-integer programming, Two-sample test

We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to distinguish samples from two groups. To solve this problem, we propose a framework based on the kernel maximum mean discrepancy (MMD). Our approach seeks a group of variables with a pre-specified size that maximizes the variance-regularized MMD statistics. …
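As a rough illustration of the quantity involved, the sketch below computes a plain (biased) estimate of the squared MMD with a Gaussian kernel, restricted to a chosen subset of variables. It is an assumption-laden toy: it omits the variance regularization and the mixed-integer selection step described in the abstract, and the names `gaussian_kernel`, `mmd_squared`, `variables`, and `bandwidth` are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(X, Y, variables, bandwidth=1.0):
    """Biased estimate of squared MMD using only the selected coordinates.

    X, Y: lists of samples (each a list of floats) from the two groups.
    variables: indices of the coordinates kept for the comparison.
    """
    Xs = [[p[i] for i in variables] for p in X]
    Ys = [[p[i] for i in variables] for p in Y]

    def mean_kernel(A, B):
        total = sum(gaussian_kernel(a, b, bandwidth) for a in A for b in B)
        return total / (len(A) * len(B))

    return mean_kernel(Xs, Xs) + mean_kernel(Ys, Ys) - 2.0 * mean_kernel(Xs, Ys)


# Illustrative data (assumed): the two groups differ only in coordinate 0.
X = [[0.0, 1.0], [0.2, 2.0], [0.1, 3.0]]
Y = [[2.0, 1.0], [2.2, 2.0], [2.1, 3.0]]
mmd_var0 = mmd_squared(X, Y, variables=[0])
mmd_var1 = mmd_squared(X, Y, variables=[1])
print(mmd_var0, mmd_var1)
```

In this toy data, selecting coordinate 0 yields a clearly larger statistic than selecting coordinate 1, which is the intuition behind choosing the variables that best distinguish the two groups.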