Risk-Averse Dynamic Programming for Markov Decision Processes

We introduce the concept of a Markov risk measure and use it to formulate risk-averse control problems for two Markov decision models: a finite horizon model and a discounted infinite horizon model. For both models we derive risk-averse dynamic programming equations and a value iteration method. For the infinite horizon problem we also develop a risk-averse policy iteration method and prove its convergence. Finally, we propose a version of the Newton method for solving a nonsmooth equation that arises in the policy iteration method, and we prove its global convergence.
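To make the idea of risk-averse value iteration for the discounted model concrete, the following is a minimal sketch, not the paper's own algorithm or notation. It assumes a finite state and action space and replaces the expectation in the usual Bellman update with a one-step risk mapping. The mean-upper-semideviation mapping used here is one illustrative choice of coherent risk measure. All names (`mean_semideviation`, `risk_averse_value_iteration`, `cost`, `trans`, `gamma`, `kappa`) and the toy data are hypothetical.

```python
def mean_semideviation(values, probs, kappa):
    """Illustrative one-step risk mapping: E[V] + kappa * E[(V - E[V])_+].

    Assumes 0 <= kappa <= 1, which keeps the mapping monotone.
    """
    mean = sum(p * v for p, v in zip(probs, values))
    upper = sum(p * max(v - mean, 0.0) for p, v in zip(probs, values))
    return mean + kappa * upper


def risk_averse_value_iteration(cost, trans, gamma, kappa,
                                tol=1e-10, max_iter=10_000):
    """Iterate v(x) <- min_u { cost[x][u] + gamma * risk(v, trans[x][u]) }.

    cost[x][u]  : immediate cost of action u in state x (assumed layout)
    trans[x][u] : list of next-state probabilities (assumed layout)
    """
    n_states = len(cost)
    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = [
            min(
                cost[x][u] + gamma * mean_semideviation(v, trans[x][u], kappa)
                for u in range(len(cost[x]))
            )
            for x in range(n_states)
        ]
        # Stop when successive iterates agree in the sup norm.
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Toy two-state, two-action problem (made-up numbers for illustration).
cost = [[1.0, 2.0], [0.5, 3.0]]
trans = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]

risk_neutral = risk_averse_value_iteration(cost, trans, gamma=0.9, kappa=0.0)
risk_averse = risk_averse_value_iteration(cost, trans, gamma=0.9, kappa=0.5)
```

Setting `kappa=0.0` recovers ordinary expected-cost value iteration, so comparing the two results shows how much the risk mapping raises the value function in this toy problem.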

Citation

Presented at the 20th International Symposium on Mathematical Programming, Chicago, August 23-28, 2009; appeared in Mathematical Programming, Series B, 2010; On-Line First, DOI: 10.1007/s10107-010-0393-3
