An algorithm for solving infinite horizon Markov dynamic programmes

We consider a general class of infinite horizon dynamic programmes in which the state and control sets are convex and compact subsets of Euclidean spaces and the (convex) costs are discounted geometrically. The aim of this work is to provide a convergence result for these problems under as few restrictions as possible. Under certain assumptions on the cost functions, the infinite horizon cost-to-go functions can be bounded above and below by a pair of convex, Lipschitz-continuous bounding functions; we seek to refine these bounding functions until an epsilon convergence criterion is met. We prove a convergence result for a simplified version of our problem, and then apply this result to the stochastic version of the problem, where uncertainty is governed by a discrete Markov process. Further, our algorithm is deterministic and requires no Monte Carlo simulation to estimate an upper bound on the cost of a given policy.
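
As a rough illustration of the bounding idea only, and not of the algorithm in the paper (which refines convex, Lipschitz-continuous bounding functions over continuous state and control sets), the Python sketch below applies the standard Bellman operator to an upper and a lower bound on a small, finite, deterministic problem and stops once the gap between them is at most epsilon. The toy problem, the grid, and all names (`bellman`, `bound_iteration`, `cost`, `nxt`, `delta`, `eps`) are assumptions made for this example.

```python
import numpy as np

def bellman(v, cost, nxt, delta):
    """One application of the discounted Bellman operator.

    cost[s, a] is the stage cost, nxt[s, a] the index of the next state,
    delta the geometric discount factor in (0, 1).
    """
    return np.min(cost + delta * v[nxt], axis=1)

def bound_iteration(cost, nxt, delta, eps):
    """Refine a lower and an upper bound until they are within eps."""
    n = cost.shape[0]
    # Constant bounds that are valid for any policy:
    # min cost forever <= optimal cost-to-go <= max cost forever.
    lower = np.full(n, cost.min() / (1.0 - delta))
    upper = np.full(n, cost.max() / (1.0 - delta))
    iters = 0
    # The operator is monotone and a delta-contraction, so both bounds
    # remain valid and the gap shrinks by at least a factor delta per pass.
    while np.max(upper - lower) > eps:
        lower = bellman(lower, cost, nxt, delta)
        upper = bellman(upper, cost, nxt, delta)
        iters += 1
    return lower, upper, iters

# Hypothetical toy instance: 11 grid points on [0, 1]; the control moves
# one grid point left, stays, or moves one grid point right.
xs = np.linspace(0.0, 1.0, 11)
moves = np.array([-1, 0, 1])
nxt = np.clip(np.arange(11)[:, None] + moves[None, :], 0, 10)
cost = xs[:, None] ** 2 + 0.1 * moves[None, :] ** 2  # convex stage cost

lower, upper, iters = bound_iteration(cost, nxt, delta=0.9, eps=1e-6)
print(iters, float(np.max(upper - lower)))
```

No sampling is involved: the upper bound comes from iterating on a bounding function directly rather than from simulating a policy, which is the property the abstract highlights, here in a much simpler finite setting.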

Citation

University of Auckland, 9th of April 2018.
