td learning – Optimization Online

Temporal difference learning with kernels for pricing American-style options

Published: 2005/05/19, Updated: 2005/06/09

Dynamic Programming, Stochastic Programming, kernels, Robbins-Monro algorithm, TD learning

In this paper we study the problem of estimating the cost-to-go function for an infinite-horizon discounted Markov chain with a possibly continuous state space. For implementation purposes, the state space is typically discretized. As soon as the dimension of the state space becomes large, the computation is no longer practicable, a phenomenon referred to …
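To make the estimation problem concrete, here is a minimal sketch of plain tabular TD(0) on a small, already discretized Markov chain. It is not the kernel-based method of the paper. The transition matrix `P`, the per-state costs `g`, the discount factor `alpha`, the step-size exponent, and both function names are hypothetical choices made for illustration only. The step sizes satisfy the usual Robbins-Monro conditions (they sum to infinity while their squares sum to a finite value).

```python
import random


def td0_cost_to_go(P, g, alpha, n_steps, seed=0):
    """Tabular TD(0) estimate of J(x) = E[sum_t alpha^t g(x_t) | x_0 = x].

    Illustrative only: assumes a finite state space, i.e. the
    discretized setting, not a kernel approximation.
    """
    rng = random.Random(seed)
    n = len(P)
    J = [0.0] * n          # current cost-to-go estimates
    visits = [0] * n       # per-state visit counts, drive the step sizes
    x = 0                  # arbitrary initial state
    for _ in range(n_steps):
        # Sample the next state from row x of the transition matrix.
        y = rng.choices(range(n), weights=P[x])[0]
        visits[x] += 1
        # Robbins-Monro step size: sum diverges, sum of squares converges.
        step = visits[x] ** -0.7
        # Temporal-difference update toward g(x) + alpha * J(y).
        J[x] += step * (g[x] + alpha * J[y] - J[x])
        x = y
    return J


def exact_cost_to_go(P, g, alpha, n_iter=500):
    """Reference solution of J = g + alpha * P J by fixed-point iteration."""
    n = len(P)
    J = [0.0] * n
    for _ in range(n_iter):
        J = [g[i] + alpha * sum(P[i][j] * J[j] for j in range(n))
             for i in range(n)]
    return J


# Hypothetical 3-state chain used only to exercise the two functions.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
g = [1.0, 0.0, 2.0]
alpha = 0.5

J_td = td0_cost_to_go(P, g, alpha, n_steps=100_000)
J_exact = exact_cost_to_go(P, g, alpha)
print("TD(0) estimate:", J_td)
print("fixed point   :", J_exact)
```

The table `J` holds one entry per discretized state, so its size grows exponentially with the dimension of the state space. That growth is the practical obstacle the abstract points to once the dimension becomes large.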