MDP modeling for multi-stage stochastic programs

We study a class of multi-stage stochastic programs that incorporate modeling features from Markov decision processes (MDPs). This class includes structured MDPs with continuous state and action spaces. We extend policy graphs to include decision-dependent uncertainty in the one-step transition probabilities as well as a limited form of statistical learning. We focus on the expressiveness of … Read more
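The policy-graph extension cannot be reconstructed from this summary, but its core modeling ingredient, one-step transition probabilities that depend on the decision taken, can be illustrated with a small finite-horizon dynamic program. The sketch below assumes only that ingredient; all names and numbers are hypothetical.

```python
# Minimal sketch (not the paper's policy-graph machinery): a finite-horizon
# dynamic program whose one-step transition probabilities depend on the chosen
# action -- decision-dependent uncertainty in its simplest form.  All names and
# numbers are illustrative assumptions.
import numpy as np

n_states, n_actions, T = 3, 2, 4
rng = np.random.default_rng(0)

# transition[a] is the row-stochastic kernel in force when action a is taken.
transition = np.array([rng.dirichlet(np.ones(n_states), size=n_states)
                       for _ in range(n_actions)])            # (A, S, S)
reward = rng.uniform(size=(n_states, n_actions))              # r(s, a)

value = np.zeros(n_states)                  # terminal value V_T = 0
policy = np.zeros((T, n_states), dtype=int)
for t in reversed(range(T)):
    # Q_t(s, a) = r(s, a) + sum_{s'} P_a(s, s') * V_{t+1}(s')
    q = reward + np.einsum("ast,t->sa", transition, value)
    policy[t] = q.argmax(axis=1)
    value = q.max(axis=1)
print("stage-0 values:", np.round(value, 3))
```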

From Optimization to Control: Quasi Policy Iteration

Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we adopt the quasi-Newton method (QNM) from convex optimization to introduce a novel control algorithm coined quasi-policy iteration (QPI). In particular, QPI is based on a novel approximation of the “Hessian” matrix … Read more
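As background for the quasi-Newton analogy, recall that classical policy iteration can be read as Newton's method applied to the Bellman fixed-point equation. The sketch below shows plain policy iteration on a small random MDP; the exact linear solve in the evaluation step is what a quasi-Newton-style scheme such as QPI would replace with a cheaper approximation (not reproduced here).

```python
# Hedged sketch: classical policy iteration on a random finite MDP.  This is
# background for the quasi-Newton analogy, not the paper's QPI algorithm.
import numpy as np

rng = np.random.default_rng(1)
S, A, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))     # P[s, a, s']
r = rng.uniform(size=(S, A))                   # r[s, a]

pi = np.zeros(S, dtype=int)
for _ in range(100):
    # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
    P_pi = P[np.arange(S), pi]                 # (S, S)
    r_pi = r[np.arange(S), pi]                 # (S,)
    v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    # Policy improvement: greedy with respect to v.
    q = r + gamma * P @ v                      # (S, A)
    new_pi = q.argmax(axis=1)
    if np.array_equal(new_pi, pi):
        break
    pi = new_pi
print("optimal policy:", pi, "values:", np.round(v, 3))
```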

Markov Decision Process Design: A Framework for Integrating Strategic and Operational Decisions

We consider the problem of optimally designing a system for repeated use under uncertainty. We develop a modeling framework that integrates design and operational phases, which are represented by a mixed-integer program and discounted-cost infinite-horizon Markov decision processes, respectively. We seek to simultaneously minimize the design costs and the subsequent expected operational costs. This problem … Read more
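The integrated mixed-integer formulation is not reconstructible from this summary; the sketch below only conveys the two-phase structure by enumerating a tiny, hypothetical design set, solving the discounted-cost MDP each design induces, and adding design and expected operational costs.

```python
# Minimal sketch of the design-then-operate idea: each candidate design has an
# upfront cost and induces its own discounted-cost MDP.  Here we enumerate a
# small illustrative design set and solve each MDP by value iteration; the
# paper instead integrates both phases in one mixed-integer program.
import numpy as np

def value_iteration(P, c, gamma=0.95, tol=1e-8):
    """Minimize expected discounted cost; P[s, a, s'], c[s, a]."""
    v = np.zeros(P.shape[0])
    while True:
        v_new = (c + gamma * P @ v).min(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new

rng = np.random.default_rng(2)
S, A = 4, 2
designs = {"small": 10.0, "large": 25.0}       # hypothetical design costs
best = None
for name, design_cost in designs.items():
    P = rng.dirichlet(np.ones(S), size=(S, A))
    c = rng.uniform(size=(S, A)) * (1.0 if name == "large" else 2.0)
    total = design_cost + value_iteration(P, c)[0]   # start-state op. cost
    if best is None or total < best[1]:
        best = (name, total)
print("chosen design:", best)
```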

Distributionally robust chance-constrained Markov decision processes

A Markov decision process (MDP) is a decision-making framework in which a decision maker seeks to maximize the expected discounted value of a stream of rewards received at future stages, at states visited according to a controlled Markov chain. Many algorithms, including linear programming methods, are available in the literature to compute … Read more
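As context for the linear-programming remark, the sketch below writes down the textbook LP for a plain (non-robust) discounted-reward MDP using `scipy.optimize.linprog`; the distributionally robust chance-constrained model of the paper is not reproduced here.

```python
# Textbook LP formulation of a discounted-reward MDP:
#   minimize alpha^T v  subject to
#   v(s) >= r(s,a) + gamma * sum_{s'} P(s'|s,a) v(s')  for all (s, a).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
S, A, gamma = 4, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))     # P[s, a, s']
r = rng.uniform(size=(S, A))

alpha = np.full(S, 1.0 / S)                    # initial-state distribution
A_ub, b_ub = [], []
for s in range(S):
    for a in range(A):
        row = gamma * P[s, a].copy()
        row[s] -= 1.0                          # gamma * P(.|s,a) - e_s
        A_ub.append(row)
        b_ub.append(-r[s, a])
res = linprog(c=alpha, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * S)
print("optimal values:", np.round(res.x, 3))
```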

Robust Phi-Divergence MDPs

In recent years, robust Markov decision processes (MDPs) have emerged as a prominent modeling framework for dynamic decision problems affected by uncertainty. In contrast to classical MDPs, which only account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, robust MDPs additionally account for ambiguity by optimizing in view … Read more
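As a minimal illustration of a phi-divergence ambiguity set, the sketch below performs an (s,a)-rectangular robust Bellman backup with a Kullback-Leibler ball (one member of the phi-divergence family) around a nominal kernel, computing the worst case through the standard one-dimensional dual; the paper's framework and algorithms are considerably more general.

```python
# Robust value iteration with a KL ambiguity ball around a nominal kernel.
# The worst-case expectation is evaluated via its standard convex dual.
import numpy as np
from scipy.optimize import minimize_scalar

def worst_case_expectation(p_hat, v, eps):
    """inf_{KL(p || p_hat) <= eps} p^T v  via its one-dimensional dual."""
    v_shift = v - v.min()                      # numerical guard for small lam
    def neg_dual(lam):
        return -(v.min()
                 - lam * np.log(np.dot(p_hat, np.exp(-v_shift / lam)))
                 - lam * eps)
    res = minimize_scalar(neg_dual, bounds=(1e-8, 1e3), method="bounded")
    return -res.fun

rng = np.random.default_rng(4)
S, A, gamma, eps = 4, 2, 0.9, 0.05
P = rng.dirichlet(np.ones(S), size=(S, A))     # nominal kernel P[s, a, s']
r = rng.uniform(size=(S, A))

v = np.zeros(S)
for _ in range(200):                           # robust value iteration
    q = np.array([[r[s, a] + gamma * worst_case_expectation(P[s, a], v, eps)
                   for a in range(A)] for s in range(S)])
    v_new = q.max(axis=1)
    if np.max(np.abs(v_new - v)) < 1e-6:
        break
    v = v_new
print("robust values:", np.round(v, 3))
```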

Data-Driven Ranges of Near-Optimal Actions for Finite Markov Decision Processes

Markov decision process (MDP) models have been used to obtain non-stationary optimal decision rules in various applications, such as treatment planning in medical decision making. However, in practice, decision makers may prefer other strategies that are not statistically different from the optimal decision rules. To benefit from the decision makers’ expertise and provide flexibility in … Read more
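The statistical machinery behind the data-driven ranges is not reconstructible from this summary; the sketch below only illustrates the underlying notion by reporting, for a fully known toy MDP, every action whose Q-value lies within a tolerance of the best one.

```python
# Toy illustration of "ranges of near-optimal actions": after solving a finite
# MDP, report for each state every action whose Q-value is within a tolerance
# of the best.  The paper builds statistically valid ranges from estimated
# parameters; here the MDP is simply assumed known.
import numpy as np

rng = np.random.default_rng(5)
S, A, gamma, tol = 4, 3, 0.9, 0.02
P = rng.dirichlet(np.ones(S), size=(S, A))
r = rng.uniform(size=(S, A))

v = np.zeros(S)
for _ in range(1000):                          # value iteration
    v = (r + gamma * P @ v).max(axis=1)

q = r + gamma * P @ v
near_optimal = [np.flatnonzero(q[s].max() - q[s] <= tol) for s in range(S)]
for s, actions in enumerate(near_optimal):
    print(f"state {s}: near-optimal actions {actions.tolist()}")
```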

Approximate Dynamic Programming for Crowd-shipping with In-store Customers

Crowd-shipping has gained significant attention as a last-mile delivery option in recent years. In this study, we propose a dynamic crowd-shipping model variant in which in-store customers act as crowd-shippers delivering online orders within a few hours. We formulate the problem as a Markov decision process and develop an approximate dynamic programming (ADP) policy using … Read more
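The crowd-shipping state space is not described in this summary, so the sketch below only shows a generic ADP loop on a tiny random MDP: simulate forward, form a sampled Bellman target at the visited state, and smooth it into a lookup-table value approximation with a declining stepsize. A real crowd-shipping policy would replace both the toy MDP and the lookup table with problem-specific structures.

```python
# Generic approximate-dynamic-programming loop (not the paper's model):
# forward simulation with a smoothed lookup-table value approximation.
import numpy as np

rng = np.random.default_rng(6)
S, A, gamma = 6, 2, 0.95
P = rng.dirichlet(np.ones(S), size=(S, A))
r = rng.uniform(size=(S, A))

v_bar = np.zeros(S)                            # value function approximation
s = 0
for n in range(1, 5001):
    q = r[s] + gamma * P[s] @ v_bar            # sampled Bellman targets at s
    a = int(q.argmax())
    alpha = 1.0 / n**0.7                       # declining stepsize
    v_bar[s] = (1 - alpha) * v_bar[s] + alpha * q[a]
    s = rng.choice(S, p=P[s, a])               # step the simulation forward
print("ADP value estimates:", np.round(v_bar, 3))
```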

Interpretable Policies and the Price of Interpretability in Hypertension Treatment Planning

Problem definition: Effective hypertension management is critical to reducing the consequences of atherosclerotic cardiovascular disease, a leading cause of death in the United States. Clinical guidelines for hypertension can be enhanced using decision-analytic approaches capable of capturing many complexities in treatment planning. However, model-generated recommendations may be uninterpretable or unintuitive, limiting their acceptability in practice. We address this … Read more
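As a toy reading of the "price of interpretability", the sketch below compares the optimal value of a small random MDP with the best value attainable by a restricted, easy-to-state policy class (monotone threshold policies over an ordered state space); the clinical models and policy classes of the paper are far richer, and every number below is an illustrative assumption.

```python
# Toy "price of interpretability": gap between the optimal MDP value and the
# best value achievable by a simple threshold policy class.
import numpy as np

rng = np.random.default_rng(7)
S, A, gamma = 5, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))
r = rng.uniform(size=(S, A))

def policy_value(pi):
    """Exact evaluation of a deterministic policy pi (array of actions)."""
    P_pi = P[np.arange(S), pi]
    r_pi = r[np.arange(S), pi]
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

# Optimal value via value iteration.
v = np.zeros(S)
for _ in range(2000):
    v = (r + gamma * P @ v).max(axis=1)

# Best threshold policy: take action 1 iff the state index >= threshold t.
best_interp = max(
    policy_value(np.array([int(s >= t) for s in range(S)])).mean()
    for t in range(S + 1))
print("price of interpretability:", round(v.mean() - best_interp, 4))
```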

Distributionally Robust Optimal Control and MDP Modeling

In this paper, we discuss Optimal Control and Markov Decision Process (MDP) formulations of multistage optimization problems when the involved probability distributions are not known exactly, but rather are assumed to belong to specified ambiguity families. The aim of this paper is to clarify the connection between these distributionally robust Optimal Control and MDP formulations of multistage stochastic optimization. … Read more
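A minimal, stage-wise rectangular reading of the distributionally robust recursion is sketched below, with nature choosing the worst transition kernel from a small finite family at every stage; the paper addresses general ambiguity families and the precise relation between the Optimal Control and MDP formulations, none of which is reproduced here.

```python
# Stage-wise rectangular distributionally robust Bellman recursion in which,
# at every stage, nature picks the worst kernel from a finite family.
import numpy as np

rng = np.random.default_rng(8)
S, A, K, T, gamma = 4, 2, 3, 5, 0.9            # K candidate kernels
kernels = rng.dirichlet(np.ones(S), size=(K, S, A))   # kernels[k, s, a, s']
r = rng.uniform(size=(S, A))

v = np.zeros(S)
for t in range(T):                             # backward recursion over stages
    # Worst-case expected continuation value over the ambiguity family.
    cont = np.min(np.einsum("ksat,t->ksa", kernels, v), axis=0)   # (S, A)
    v = (r + gamma * cont).max(axis=1)
print("robust multistage values:", np.round(v, 3))
```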

Multi-period Workload Balancing in Last-Mile Urban Delivery

In the daily dispatching of urban deliveries, a delivery manager has to consider workload balance among the couriers to maintain workforce morale. We consider two types of workload: incentive workload, which relates to the delivery quantity and affects a courier’s income, and effort workload, which relates to the delivery time and affects a courier’s health. … Read more