Dynamic Node Packing

We propose a dynamic version of the classical node packing problem, also called the stable set or independent set problem. The problem is defined by a node set, a node weight vector, and an edge probability vector. For every pair of nodes, an edge is present or not according to an independent Bernoulli random variable …
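As a rough illustration of the setup this abstract describes — not the paper's method — here is a sketch that samples each potential edge from an independent Bernoulli variable and then builds a node packing (independent set) with a simple greedy heuristic. The node count, weights, and edge probability are hypothetical.

```python
import random

def sample_graph(n, p):
    """Sample each of the n*(n-1)/2 possible edges independently
    with probability p (Bernoulli), as in the problem description."""
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                edges.add((i, j))
    return edges

def greedy_packing(n, weights, edges):
    """Greedily pick nodes in decreasing weight order, skipping any
    node adjacent to one already packed (a heuristic, not optimal)."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    packed = []
    for v in sorted(range(n), key=lambda v: -weights[v]):
        if all(u not in adj[v] for u in packed):
            packed.append(v)
    return packed

random.seed(0)
w = [3.0, 1.0, 2.0, 5.0]   # hypothetical node weights
E = sample_graph(4, 0.5)    # hypothetical edge probability 0.5
S = greedy_packing(4, w, E)
```

By construction no two packed nodes share a sampled edge; the dynamic problem in the abstract would additionally decide which nodes to select before the edge realizations are fully observed.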

Risk Aversion to Parameter Uncertainty in Markov Decision Processes with an Application to Slow-Onset Disaster Relief

In classical Markov Decision Processes (MDPs), action costs and transition probabilities are assumed to be known, although an accurate estimation of these parameters is often not possible in practice. This study addresses MDPs under cost and transition probability uncertainty and aims to provide a mathematical framework to obtain policies minimizing the risk of high long-term …

Decomposition Methods for Solving Markov Decision Processes with Multiple Models of the Parameters

We consider the problem of decision-making in Markov decision processes (MDPs) when the reward or transition probability parameters are not known with certainty. We consider an approach in which the decision-maker (DM) considers multiple models of the parameters for an MDP and wishes to find a policy that optimizes an objective function that considers the …
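To make the multi-model setting concrete — as a toy sketch, not the decomposition methods the paper develops — one can evaluate a fixed policy under each model of the transition probabilities and combine the resulting values with the DM's model weights. All numbers below are hypothetical.

```python
def policy_value(P, r, policy, gamma=0.9, iters=500):
    """Evaluate a deterministic stationary policy on a finite MDP by
    iterating the Bellman equation v = r_pi + gamma * P_pi v."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s][policy[s]]
             + gamma * sum(P[s][policy[s]][t] * v[t] for t in range(n))
             for s in range(n)]
    return v

# Two hypothetical models of the transition probabilities for a
# 2-state, 2-action MDP; rewards are shared across models for brevity.
P1 = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
P2 = [[[0.6, 0.4], [0.3, 0.7]], [[0.4, 0.6], [0.2, 0.8]]]
r = [[1.0, 0.0], [0.0, 2.0]]
policy = [0, 1]
weights = [0.5, 0.5]  # DM's weights on the two models
weighted = [weights[0] * a + weights[1] * b
            for a, b in zip(policy_value(P1, r, policy),
                            policy_value(P2, r, policy))]
```

Optimizing the weighted objective over policies (rather than evaluating one fixed policy) is the hard part that motivates decomposition.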

Dynamic Scheduling of Home Health Care Patients to Medical Providers

Home care provides personalized medical care and social support to patients within their own home. Our work proposes a dynamic scheduling framework to assist in the assignment of patients to health practitioners (HPs) at a single home care agency. We model the decision of which patients to assign to HPs as a discrete-time Markov decision …

Envelope Theorems for Multi-Stage Linear Stochastic Optimization

We propose a method to compute derivatives of multi-stage linear stochastic optimization problems with respect to parameters that influence the problem’s data. Our results are based on classical envelope theorems, and can be used in problems directly solved via their deterministic equivalents as well as in stochastic dual dynamic programming for which the derivatives of …
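The classical envelope idea the abstract invokes can be seen in a one-dimensional toy LP (hypothetical numbers, far simpler than the multi-stage setting): for min c·x s.t. x ≥ b with c > 0, the optimal value is V(b) = c·b, and dV/db equals the dual multiplier of the binding constraint, here just c, with no need to differentiate through the optimizer x(b).

```python
def lp_value(b, c=2.0):
    """Optimal value of the toy LP  min c*x  s.t.  x >= b  (c > 0).
    The constraint binds at the optimum, so x* = b and V(b) = c*b."""
    x_opt = b
    return c * x_opt

# Envelope theorem check: the derivative of the optimal value with
# respect to the right-hand side b equals the dual multiplier c.
c, b, eps = 2.0, 1.5, 1e-6
fd = (lp_value(b + eps, c) - lp_value(b - eps, c)) / (2 * eps)
```

The same principle is what lets one read sensitivities off dual solutions in the deterministic equivalent of a multi-stage problem.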

Modeling Time-dependent Randomness in Stochastic Dual Dynamic Programming

We consider the multistage stochastic programming problem where uncertainty enters the right-hand sides of the problem. Stochastic Dual Dynamic Programming (SDDP) is a popular method to solve such problems under the assumption that the random data process is stagewise independent. There exist two approaches to incorporate dependence into SDDP. One approach is to model the …

Revisiting Approximate Linear Programming Using a Saddle Point Approach

Approximate linear programs (ALPs) are well-known models for computing value function approximations (VFAs) of intractable Markov decision processes (MDPs) arising in applications. VFAs from ALPs have desirable theoretical properties, define an operating policy, and provide a lower bound on the optimal policy cost, which can be used to assess the suboptimality of heuristic policies. However, …

Lower Bound on the Computational Complexity of Discounted Markov Decision Problems

We study the computational complexity of the infinite-horizon discounted-reward Markov Decision Problem (MDP) with a finite state space $\mathcal{S}$ and a finite action space $\mathcal{A}$. We show that any randomized algorithm needs a running time at least $\Omega(|\mathcal{S}|^2 |\mathcal{A}|)$ to compute an $\epsilon$-optimal policy with high probability. We consider two variants of the MDP where the …

Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear (Sometimes Sublinear) Running Time

We propose a randomized linear programming algorithm for approximating the optimal policy of the discounted Markov decision problem. By leveraging the value-policy duality, the algorithm adaptively samples state transitions and makes exponentiated primal-dual updates. We show that it finds an ε-optimal policy using nearly-linear running time in the worst case. For Markov decision processes that …
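To illustrate only the general idea of approximating an optimal policy from sampled state transitions — this is plain Q-learning on a hypothetical 2-state MDP, not the paper's randomized primal-dual LP algorithm — consider the following sketch.

```python
import random

def sample_next(P_row):
    """Sample a next state from a transition distribution."""
    u, acc = random.random(), 0.0
    for s, p in enumerate(P_row):
        acc += p
        if u < acc:
            return s
    return len(P_row) - 1

# Tiny 2-state, 2-action discounted MDP (hypothetical numbers):
# P[s][a] is the transition distribution, r[s][a] the reward.
P = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
r = [[1.0, 0.0], [0.0, 2.0]]
gamma = 0.9

random.seed(1)
Q = [[0.0, 0.0], [0.0, 0.0]]
for k in range(1, 20001):
    s, a = random.randrange(2), random.randrange(2)
    s2 = sample_next(P[s][a])              # one sampled transition
    alpha = 1.0 / (1 + k / 100)            # decaying step size
    target = r[s][a] + gamma * max(Q[s2])  # sampled Bellman target
    Q[s][a] += alpha * (target - Q[s][a])
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(2)]
```

The paper's algorithm instead maintains primal-dual LP iterates and uses exponentiated (multiplicative) updates to obtain its running-time guarantees.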

Optimal Control of MDPs with Unbounded Cost on Infinite Horizon

We use Markov risk measures to formulate a risk-averse version of a total cost problem on a controlled Markov process in infinite horizon. The one-step costs are in $L^1$ but not necessarily bounded. We derive the conditions for the existence of the optimal strategies and present the robust dynamic programming equations. We illustrate …
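Markov risk measures are built from one-step conditional risk mappings; a standard concrete example (used here purely as a toy, with hypothetical sample costs — the paper's construction is more general) is the conditional value-at-risk, computable on an empirical distribution via the variational formula CVaR_α(X) = min_t { t + E[(X − t)⁺] / (1 − α) }.

```python
def cvar(costs, alpha):
    """CVaR of a finite sample of costs at level alpha, via
    CVaR(X) = min_t { t + E[(X - t)+] / (1 - alpha) }.
    For an empirical distribution the minimizer is attained at a
    sample point, so it suffices to search t over the data."""
    xs = sorted(costs)
    n = len(xs)
    best = float("inf")
    for t in xs:
        val = t + sum(max(x - t, 0.0) for x in xs) / (n * (1 - alpha))
        best = min(best, val)
    return best

costs = [1.0, 2.0, 3.0, 10.0]  # hypothetical one-step cost sample
```

At α = 0 this reduces to the expected cost (risk neutrality); as α → 1 it weights the worst outcomes ever more heavily, which is the kind of aversion a risk-averse total-cost formulation encodes stage by stage.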