MDP modeling for multi-stage stochastic programs

We study a class of multi-stage stochastic programs that incorporate modeling features from Markov decision processes (MDPs). This class includes structured MDPs with continuous state and action spaces. We extend policy graphs to include decision-dependent uncertainty in the one-step transition probabilities as well as a limited form of statistical learning. We focus on the expressiveness of … Read more
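The policy-graph extension cannot be reconstructed from this summary, but its core modeling ingredient, one-step transition probabilities that depend on the decision taken, can be illustrated with a small finite-horizon dynamic program. The sketch below assumes only that ingredient; all names and numbers are hypothetical.

```python
# Minimal sketch (not the paper's policy-graph machinery): a finite-horizon
# dynamic program whose one-step transition probabilities depend on the chosen
# action -- decision-dependent uncertainty in its simplest form.  All names and
# numbers are illustrative assumptions.
import numpy as np

n_states, n_actions, T = 3, 2, 4
rng = np.random.default_rng(0)

# transition[a] is the row-stochastic kernel in force when action a is taken.
transition = np.array([rng.dirichlet(np.ones(n_states), size=n_states)
                       for _ in range(n_actions)])            # (A, S, S)
reward = rng.uniform(size=(n_states, n_actions))              # r(s, a)

value = np.zeros(n_states)                  # terminal value V_T = 0
policy = np.zeros((T, n_states), dtype=int)
for t in reversed(range(T)):
    # Q_t(s, a) = r(s, a) + sum_{s'} P_a(s, s') * V_{t+1}(s')
    q = reward + np.einsum("ast,t->sa", transition, value)
    policy[t] = q.argmax(axis=1)
    value = q.max(axis=1)
print("stage-0 values:", np.round(value, 3))
```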

From Optimization to Control: Quasi Policy Iteration

Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we adopt the quasi-Newton method (QNM) from convex optimization to introduce a novel control algorithm coined quasi-policy iteration (QPI). In particular, QPI is based on a novel approximation of the “Hessian” matrix … Read more
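As background for the quasi-Newton analogy, recall that classical policy iteration can be read as Newton's method applied to the Bellman fixed-point equation. The sketch below shows plain policy iteration on a small random MDP; the exact linear solve in the evaluation step is what a quasi-Newton-style scheme such as QPI would replace with a cheaper approximation (not reproduced here).

```python
# Hedged sketch: classical policy iteration on a random finite MDP.  This is
# background for the quasi-Newton analogy, not the paper's QPI algorithm.
import numpy as np

rng = np.random.default_rng(1)
S, A, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))     # P[s, a, s']
r = rng.uniform(size=(S, A))                   # r[s, a]

pi = np.zeros(S, dtype=int)
for _ in range(100):
    # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
    P_pi = P[np.arange(S), pi]                 # (S, S)
    r_pi = r[np.arange(S), pi]                 # (S,)
    v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    # Policy improvement: greedy with respect to v.
    q = r + gamma * P @ v                      # (S, A)
    new_pi = q.argmax(axis=1)
    if np.array_equal(new_pi, pi):
        break
    pi = new_pi
print("optimal policy:", pi, "values:", np.round(v, 3))
```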

Markov Decision Process Design: A Framework for Integrating Strategic and Operational Decisions

We consider the problem of optimally designing a system for repeated use under uncertainty. We develop a modeling framework that integrates design and operational phases, which are represented by a mixed-integer program and discounted-cost infinite-horizon Markov decision processes, respectively. We seek to simultaneously minimize the design costs and the subsequent expected operational costs. This problem … Read more
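The integrated mixed-integer formulation is not reconstructible from this summary; the sketch below only conveys the two-phase structure by enumerating a tiny, hypothetical design set, solving the discounted-cost MDP each design induces, and adding design and expected operational costs.

```python
# Minimal sketch of the design-then-operate idea: each candidate design has an
# upfront cost and induces its own discounted-cost MDP.  Here we enumerate a
# small illustrative design set and solve each MDP by value iteration; the
# paper instead integrates both phases in one mixed-integer program.
import numpy as np

def value_iteration(P, c, gamma=0.95, tol=1e-8):
    """Minimize expected discounted cost; P[s, a, s'], c[s, a]."""
    v = np.zeros(P.shape[0])
    while True:
        v_new = (c + gamma * P @ v).min(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new

rng = np.random.default_rng(2)
S, A = 4, 2
designs = {"small": 10.0, "large": 25.0}       # hypothetical design costs
best = None
for name, design_cost in designs.items():
    P = rng.dirichlet(np.ones(S), size=(S, A))
    c = rng.uniform(size=(S, A)) * (1.0 if name == "large" else 2.0)
    total = design_cost + value_iteration(P, c)[0]   # start-state op. cost
    if best is None or total < best[1]:
        best = (name, total)
print("chosen design:", best)
```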

Distributionally robust chance-constrained Markov decision processes

A Markov decision process (MDP) is a decision-making framework in which a decision maker seeks to maximize the expected discounted value of a stream of rewards received at future stages, at states visited according to a controlled Markov chain. Many algorithms, including linear programming methods, are available in the literature to compute … Read more
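As context for the linear-programming remark, the sketch below writes down the textbook LP for a plain (non-robust) discounted-reward MDP using `scipy.optimize.linprog`; the distributionally robust chance-constrained model of the paper is not reproduced here.

```python
# Textbook LP formulation of a discounted-reward MDP:
#   minimize alpha^T v  subject to
#   v(s) >= r(s,a) + gamma * sum_{s'} P(s'|s,a) v(s')  for all (s, a).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
S, A, gamma = 4, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))     # P[s, a, s']
r = rng.uniform(size=(S, A))

alpha = np.full(S, 1.0 / S)                    # initial-state distribution
A_ub, b_ub = [], []
for s in range(S):
    for a in range(A):
        row = gamma * P[s, a].copy()
        row[s] -= 1.0                          # gamma * P(.|s,a) - e_s
        A_ub.append(row)
        b_ub.append(-r[s, a])
res = linprog(c=alpha, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * S)
print("optimal values:", np.round(res.x, 3))
```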

Robust Phi-Divergence MDPs

In recent years, robust Markov decision processes (MDPs) have emerged as a prominent modeling framework for dynamic decision problems affected by uncertainty. In contrast to classical MDPs, which only account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, robust MDPs additionally account for ambiguity by optimizing in view … Read more
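As a minimal illustration of a phi-divergence ambiguity set, the sketch below performs an (s,a)-rectangular robust Bellman backup with a Kullback-Leibler ball (one member of the phi-divergence family) around a nominal kernel, computing the worst case through the standard one-dimensional dual; the paper's framework and algorithms are considerably more general.

```python
# Robust value iteration with a KL ambiguity ball around a nominal kernel.
# The worst-case expectation is evaluated via its standard convex dual.
import numpy as np
from scipy.optimize import minimize_scalar

def worst_case_expectation(p_hat, v, eps):
    """inf_{KL(p || p_hat) <= eps} p^T v  via its one-dimensional dual."""
    v_shift = v - v.min()                      # numerical guard for small lam
    def neg_dual(lam):
        return -(v.min()
                 - lam * np.log(np.dot(p_hat, np.exp(-v_shift / lam)))
                 - lam * eps)
    res = minimize_scalar(neg_dual, bounds=(1e-8, 1e3), method="bounded")
    return -res.fun

rng = np.random.default_rng(4)
S, A, gamma, eps = 4, 2, 0.9, 0.05
P = rng.dirichlet(np.ones(S), size=(S, A))     # nominal kernel P[s, a, s']
r = rng.uniform(size=(S, A))

v = np.zeros(S)
for _ in range(200):                           # robust value iteration
    q = np.array([[r[s, a] + gamma * worst_case_expectation(P[s, a], v, eps)
                   for a in range(A)] for s in range(S)])
    v_new = q.max(axis=1)
    if np.max(np.abs(v_new - v)) < 1e-6:
        break
    v = v_new
print("robust values:", np.round(v, 3))
```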

Data-Driven Ranges of Near-Optimal Actions for Finite Markov Decision Processes

Markov decision process (MDP) models have been used to obtain non-stationary optimal decision rules in various applications, such as treatment planning in medical decision making. However, in practice, decision makers may prefer other strategies that are not statistically different from the optimal decision rules. To benefit from the decision makers’ expertise and provide flexibility in … Read more
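The statistical machinery behind the data-driven ranges is not reconstructible from this summary; the sketch below only illustrates the underlying notion by reporting, for a fully known toy MDP, every action whose Q-value lies within a tolerance of the best one.

```python
# Toy illustration of "ranges of near-optimal actions": after solving a finite
# MDP, report for each state every action whose Q-value is within a tolerance
# of the best.  The paper builds statistically valid ranges from estimated
# parameters; here the MDP is simply assumed known.
import numpy as np

rng = np.random.default_rng(5)
S, A, gamma, tol = 4, 3, 0.9, 0.02
P = rng.dirichlet(np.ones(S), size=(S, A))
r = rng.uniform(size=(S, A))

v = np.zeros(S)
for _ in range(1000):                          # value iteration
    v = (r + gamma * P @ v).max(axis=1)

q = r + gamma * P @ v
near_optimal = [np.flatnonzero(q[s].max() - q[s] <= tol) for s in range(S)]
for s, actions in enumerate(near_optimal):
    print(f"state {s}: near-optimal actions {actions.tolist()}")
```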

Approximate Dynamic Programming for Crowd-shipping with In-store Customers

Crowd-shipping has gained significant attention as a last-mile delivery option in recent years. In this study, we propose a dynamic crowd-shipping model variant in which in-store customers act as crowd-shippers delivering online orders within a few hours. We formulate the problem as a Markov decision process and develop an approximate dynamic programming (ADP) policy using … Read more
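The crowd-shipping state space is not described in this summary, so the sketch below only shows a generic ADP loop on a tiny random MDP: simulate forward, form a sampled Bellman target at the visited state, and smooth it into a lookup-table value approximation with a declining stepsize. A real crowd-shipping policy would replace both the toy MDP and the lookup table with problem-specific structures.

```python
# Generic approximate-dynamic-programming loop (not the paper's model):
# forward simulation with a smoothed lookup-table value approximation.
import numpy as np

rng = np.random.default_rng(6)
S, A, gamma = 6, 2, 0.95
P = rng.dirichlet(np.ones(S), size=(S, A))
r = rng.uniform(size=(S, A))

v_bar = np.zeros(S)                            # value function approximation
s = 0
for n in range(1, 5001):
    q = r[s] + gamma * P[s] @ v_bar            # sampled Bellman targets at s
    a = int(q.argmax())
    alpha = 1.0 / n**0.7                       # declining stepsize
    v_bar[s] = (1 - alpha) * v_bar[s] + alpha * q[a]
    s = rng.choice(S, p=P[s, a])               # step the simulation forward
print("ADP value estimates:", np.round(v_bar, 3))
```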

Interpretable Policies and the Price of Interpretability in Hypertension Treatment Planning

Problem definition: Effective hypertension management is critical to reducing the consequences of atherosclerotic cardiovascular disease, a leading cause of death in the United States. Clinical guidelines for hypertension can be enhanced using decision-analytic approaches capable of capturing many complexities in treatment planning. However, model-generated recommendations may be uninterpretable or unintuitive, limiting their acceptability in practice. We address this … Read more
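As a toy reading of the "price of interpretability", the sketch below compares the optimal value of a small random MDP with the best value attainable by a restricted, easy-to-state policy class (monotone threshold policies over an ordered state space); the clinical models and policy classes of the paper are far richer, and every number below is an illustrative assumption.

```python
# Toy "price of interpretability": gap between the optimal MDP value and the
# best value achievable by a simple threshold policy class.
import numpy as np

rng = np.random.default_rng(7)
S, A, gamma = 5, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))
r = rng.uniform(size=(S, A))

def policy_value(pi):
    """Exact evaluation of a deterministic policy pi (array of actions)."""
    P_pi = P[np.arange(S), pi]
    r_pi = r[np.arange(S), pi]
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

# Optimal value via value iteration.
v = np.zeros(S)
for _ in range(2000):
    v = (r + gamma * P @ v).max(axis=1)

# Best threshold policy: take action 1 iff the state index >= threshold t.
best_interp = max(
    policy_value(np.array([int(s >= t) for s in range(S)])).mean()
    for t in range(S + 1))
print("price of interpretability:", round(v.mean() - best_interp, 4))
```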

Distributionally Robust Optimal Control and MDP Modeling

In this paper, we discuss Optimal Control and Markov Decision Process (MDP) formulations of multistage optimization problems when the involved probability distributions are not known exactly, but rather are assumed to belong to specified ambiguity families. The aim of this paper is to clarify the connection between these distributionally robust Optimal Control and MDP formulations of multistage stochastic optimization. … Read more
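A minimal, stage-wise rectangular reading of the distributionally robust recursion is sketched below, with nature choosing the worst transition kernel from a small finite family at every stage; the paper addresses general ambiguity families and the precise relation between the Optimal Control and MDP formulations, none of which is reproduced here.

```python
# Stage-wise rectangular distributionally robust Bellman recursion in which,
# at every stage, nature picks the worst kernel from a finite family.
import numpy as np

rng = np.random.default_rng(8)
S, A, K, T, gamma = 4, 2, 3, 5, 0.9            # K candidate kernels
kernels = rng.dirichlet(np.ones(S), size=(K, S, A))   # kernels[k, s, a, s']
r = rng.uniform(size=(S, A))

v = np.zeros(S)
for t in range(T):                             # backward recursion over stages
    # Worst-case expected continuation value over the ambiguity family.
    cont = np.min(np.einsum("ksat,t->ksa", kernels, v), axis=0)   # (S, A)
    v = (r + gamma * cont).max(axis=1)
print("robust multistage values:", np.round(v, 3))
```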

Multi-period Workload Balancing in Last-Mile Urban Delivery

In the daily dispatching of urban deliveries, a delivery manager has to consider workload balance among the couriers to maintain workforce morale. We consider two types of workload: incentive workload, which relates to the delivery quantity and affects a courier’s income, and effort workload, which relates to the delivery time and affects a courier’s health. … Read more