Learning Optimal Classification Trees Robust to Distribution Shifts

We consider the problem of learning classification trees that are robust to distribution shifts between training and testing/deployment data. This problem arises frequently in high stakes settings such as public health and social work where data is often collected using self-reported surveys which are highly sensitive to e.g., the framing of the questions, the time … Read more

A Reciprocity Between Tree Ensemble Optimization and Multilinear Optimization

In this paper, we establish a polynomial equivalence between tree ensemble optimization and optimization of multilinear functions over the Cartesian product of simplices. We use this insight to derive new formulations for tree ensemble optimization problems and to obtain new convex hull results for multilinear polytopes. A computational experiment on multi-commodity transportation problems with costs … Read more

Shattering Inequalities for Learning Optimal Decision Trees

Recently, mixed-integer programming (MIP) techniques have been applied to learn optimal decision trees. Empirical research has shown that optimal trees typically have better out-of-sample performance than heuristic approaches such as CART. However, the underlying MIP formulations often suffer from slow runtimes, due to weak linear programming (LP) relaxations. In this paper, we first propose a … Read more

Optimal Decision Trees for Categorical Data via Integer Programming

Decision trees have been a very popular class of predictive models for decades due to their interpretability and good performance on categorical features. However, they are not always robust and tend to overfit the data. Additionally, if allowed to grow large, they lose interpretability. In this paper, we present a novel mixed integer programming formulation … Read more

Dynamic Generation of Scenario Trees

We present new algorithms for the dynamic generation of scenario trees for multistage stochastic optimization. The different methods described are based on random vectors, which are drawn from conditional distributions given the past and on sample trajectories. The structure of the tree is not determined beforehand, but dynamically adapted to meet a distance criterion, which … Read more