Reinforcement Learning via Parametric Cost Function Approximation for Multistage Stochastic Programming

The most common approaches for solving stochastic resource allocation problems in the research literature is to either use value functions (“dynamic programming”) or scenario trees (“stochastic programming”) to approximate the impact of a decision now on the future. By contrast, common industry practice is to use a deterministic approximation of the future which is easier … Read more