From Optimization to Control: Quasi Policy Iteration

Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we make this analogy explicit across four problem classes with a unified solution characterization. This novel framework, in turn, allows for a systematic transformation of algorithms from one domain to the other. In … Read more
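
For context, the classical baseline behind this analogy is policy iteration, which evaluates the current policy exactly and then improves it greedily, and which is well known to coincide with Newton's method applied to the Bellman equation; the paper's quasi-policy-iteration variants are not reproduced here. Below is a minimal sketch on a randomly generated toy MDP (dimensions and notation are illustrative assumptions only).

    import numpy as np

    # P[a, s, s'] are transition probabilities, r[s, a] rewards, gamma the discount.
    rng = np.random.default_rng(0)
    n_states, n_actions, gamma = 5, 3, 0.9
    P = rng.random((n_actions, n_states, n_states))
    P /= P.sum(axis=2, keepdims=True)
    r = rng.random((n_states, n_actions))

    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, np.arange(n_states), :]
        r_pi = r[np.arange(n_states), policy]
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: greedy step with respect to the current value.
        q = r + gamma * np.einsum("ast,t->sa", P, v)
        new_policy = q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy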

Markov Decision Process Design: A Framework for Integrating Strategic and Operational Decisions

We consider the problem of optimally designing a system for repeated use under uncertainty. We develop a modeling framework that integrates design and operational phases, which are represented by a mixed-integer program and discounted-cost infinite-horizon Markov decision processes, respectively. We seek to simultaneously minimize the design costs and the subsequent expected operational costs. This problem … Read more
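
Schematically, and with notation assumed here rather than taken from the paper, an integrated design-and-operation objective of the kind described can be written as

\[
\min_{x \in \mathcal{X}} \; c^{\top} x \;+\; \min_{\pi}\; \mathbb{E}^{\pi}_{x}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c^{\mathrm{op}}(s_t, a_t)\right],
\]

where $x$ collects the mixed-integer design decisions with cost $c^{\top}x$, $\pi$ ranges over operational policies, and the expectation is over trajectories of the discounted-cost MDP whose dynamics and stage costs depend on the chosen design $x$.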

Distributionally robust chance-constrained Markov decision processes

A Markov decision process (MDP) is a decision-making framework in which a decision maker seeks to maximize the expected discounted value of a stream of rewards received at future stages, at states visited according to a controlled Markov chain. Many algorithms, including linear programming methods, are available in the literature to compute … Read more
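
As a point of reference for the linear programming methods mentioned above, the standard (non-robust) LP formulation of a discounted reward-maximizing MDP is

\[
\min_{v \in \mathbb{R}^{|S|}} \; \sum_{s \in S} \mu(s)\, v(s)
\quad \text{s.t.} \quad
v(s) \;\ge\; r(s,a) + \gamma \sum_{s' \in S} p(s' \mid s,a)\, v(s') \qquad \forall s \in S,\; a \in A,
\]

where $\mu$ is any strictly positive weight vector (for instance an initial-state distribution); the optimal $v$ is the value function, and any policy that is greedy with respect to it is optimal. The paper's distributionally robust chance-constrained model modifies this basic setup in ways not shown here.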

Robust Phi-Divergence MDPs

In recent years, robust Markov decision processes (MDPs) have emerged as a prominent modeling framework for dynamic decision problems affected by uncertainty. In contrast to classical MDPs, which only account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, robust MDPs additionally account for ambiguity by optimizing in view … Read more
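
In the usual $(s,a)$-rectangular form, stated here with assumed notation and not necessarily the paper's exact construction, the robust Bellman update optimizes against the worst transition law in a divergence ball around a nominal kernel $\hat p$:

\[
(\mathcal{T}v)(s) \;=\; \max_{a \in A}\; \min_{p \in \mathcal{P}_{s,a}} \Big\{ r(s,a) + \gamma\, p^{\top} v \Big\},
\qquad
\mathcal{P}_{s,a} \;=\; \Big\{ p \in \Delta(S) : \sum_{s'} \hat p(s' \mid s,a)\, \phi\!\Big(\tfrac{p(s')}{\hat p(s' \mid s,a)}\Big) \le \kappa \Big\},
\]

where $\phi$ is a convex divergence generator (Kullback-Leibler, chi-square, and so on) and $\kappa$ is the ambiguity budget.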

Data-Driven Ranges of Near-Optimal Actions for Finite Markov Decision Processes

Markov decision process (MDP) models have been used to obtain non-stationary optimal decision rules in various applications, such as treatment planning in medical decision making. However, in practice, decision makers may prefer other strategies that are not statistically different from the optimal decision rules. To benefit from the decision makers’ expertise and provide flexibility in … Read more
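
As a purely hypothetical illustration of the idea of action ranges, the sketch below collects, for each state, every action whose Q-value lies within a fixed tolerance of the best one; the paper's ranges are derived from a statistical, data-driven criterion rather than the hard threshold `tol` used here.

    import numpy as np

    def near_optimal_actions(q_values, tol):
        # q_values[s, a]: estimated action values; tol: illustrative tolerance.
        # Returns, for each state, the indices of actions within tol of the best.
        best = q_values.max(axis=1, keepdims=True)
        return [np.flatnonzero(q_values[s] >= best[s, 0] - tol)
                for s in range(q_values.shape[0])]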

Approximate Dynamic Programming for Crowd-shipping with In-store Customers

Crowd-shipping has gained significant attention as a last-mile delivery option in recent years. In this study, we propose a variant of the dynamic crowd-shipping model in which in-store customers act as crowd-shippers delivering online orders within a few hours. We formulate the problem as a Markov decision process and develop an approximate dynamic programming (ADP) policy using … Read more
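
Since the teaser cuts off before the method details, the snippet below only illustrates the generic approximate-dynamic-programming pattern the abstract refers to: visit states, take a one-step lookahead against the current value approximation, and smooth the estimate. The crowd-shipping state space, order dynamics, and reward model are placeholders supplied by the caller, not the paper's.

    import numpy as np

    def adp_value_estimates(states, actions, reward, sample_next_state,
                            gamma=0.95, n_iters=500, alpha=0.1, seed=0):
        # Lookup-table approximate value iteration (a generic ADP sketch, not the
        # paper's algorithm). reward(s, a) and sample_next_state(s, a, rng) are
        # assumed callables describing the underlying MDP.
        rng = np.random.default_rng(seed)
        v_bar = {s: 0.0 for s in states}
        for _ in range(n_iters):
            s = states[rng.integers(len(states))]          # visit a random state
            q = [reward(s, a) + gamma * v_bar[sample_next_state(s, a, rng)]
                 for a in actions]
            v_bar[s] = (1 - alpha) * v_bar[s] + alpha * max(q)  # smoothed update
        return v_bar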

Interpretable Policies and the Price of Interpretability in Hypertension Treatment Planning

Problem definition: Effective hypertension management is critical to reducing the consequences of atherosclerotic cardiovascular disease, a leading cause of death in the United States. Clinical guidelines for hypertension can be enhanced using decision-analytic approaches capable of capturing many of the complexities in treatment planning. However, model-generated recommendations may be uninterpretable or unintuitive, limiting their acceptability in practice. We address this … Read more

Distributionally Robust Optimal Control and MDP Modeling

In this paper, we discuss Optimal Control and Markov Decision Process (MDP) formulations of multistage optimization problems in which the involved probability distributions are not known exactly but are instead assumed to belong to specified ambiguity families. The aim of this paper is to clarify the connection between such distributionally robust approaches to multistage stochastic optimization. … Read more

Multi-period Workload Balancing in Last-Mile Urban Delivery

In the daily dispatching of urban deliveries, a delivery manager has to consider workload balance among the couriers to maintain workforce morale. We consider two types of workload: incentive workload, which relates to the delivery quantity and affects a courier’s income, and effort workload, which relates to the delivery time and affects a courier’s health. … Read more

Partial Policy Iteration for L1-Robust Markov Decision Processes

Robust Markov decision processes (MDPs) make it possible to compute reliable solutions for dynamic decision problems whose evolution is modeled by rewards and partially known transition probabilities. Unfortunately, accounting for uncertainty in the transition probabilities significantly increases the computational complexity of solving robust MDPs, which severely limits their scalability. This paper describes new efficient algorithms for solving the … Read more
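
The computational bottleneck in this setting is the inner worst-case problem of the robust Bellman update over an L1 ball around a nominal distribution. A standard sort-based routine for that inner problem is sketched below with assumed notation (it is not taken from the paper's implementation): shift at most half the budget of probability mass from the highest-value states onto the lowest-value state.

    import numpy as np

    def worst_case_l1(p_hat, v, kappa):
        # min_p  p @ v   s.t.  ||p - p_hat||_1 <= kappa,  p in the simplex.
        # Greedy mass shifting: donate from high-value states to the lowest-value one.
        p = np.asarray(p_hat, dtype=float).copy()
        i_min = int(np.argmin(v))
        budget = min(kappa / 2.0, 1.0 - p[i_min])    # total mass we may move
        for i in np.argsort(v)[::-1]:                # donors: high-value states first
            if budget <= 1e-12:
                break
            if i == i_min:
                continue
            shift = min(p[i], budget)
            p[i] -= shift
            p[i_min] += shift
            budget -= shift
        return float(p @ v)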