Singular perturbation techniques allow the derivation of an aggregate model whose solution is asymptotically optimal for Markov Decision Processes with strong and weak interactions. We develop an algorithm that takes advantage of the asymptotic optimality of the aggregate model in order to compute the solution of the original model with theoretically better complexity than conventional contraction algorithms. Based on our complexity analysis we show that the major benefit of aggregation is that the reduced order model is no longer ill conditioned, and the reduction in the number of states (due to aggregation) is a secondary benefit. This is a surprising result since intuition would suggest that the reduced order model can be solved more efficiently because it has fewer states. However we show that this is not necessarily the case. Our convergence analysis and numerical experiments show that the proposed algorithm can compute the optimal solution with a reduction in computational complexity without any penalty in accuracy.
Citation
Working paper, Department of Computing, Imperial College London, November 2013