We present a general aggregation method applicable to all finite-horizon Markov decision problems (MDPs). States of the MDP are aggregated into macro-states based on a pre-selected collection of “distinguished” states that serve as entry points into the macro-states. The resulting macro-problem is itself an MDP, whose solution approximates an optimal solution to the original problem. The aggregation scheme also provides a way to incorporate inter-period action space constraints without losing the Markov property.
Technical Report 04-07, University of Michigan, Industrial and Operations Engineering, 1205 Beal Ave., Ann Arbor, MI 48105, July 2004