cost function approximation (CFA) policy

Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization

Published: 2015/09/12, Updated: 2017/05/06

This paper presents an approach to non-stationary policy search for finite-horizon, discrete-time Markovian decision problems with large state spaces, constrained action sets, and a risk-sensitive optimality criterion. The methodology models time-variant policy parameters with a non-parametric response surface model for an indirect parametrized policy motivated by the Bellman equation. Through the interpolating …