Jean-Sebastien Roy – Optimization Online

A Q-Learning Algorithm with Continuous State Space

Published: 2006/09/23

Categories: Dynamic Programming, Stochastic Programming
Keywords: continuous state space, kernels, Q-learning

In this paper, we study a Markov Decision Problem (MDP) with a continuous state space and discrete decision variables. We propose an extension of the Q-learning algorithm, which Watkins introduced in 1989 to solve this problem for completely discrete MDPs. Our algorithm relies on stochastic approximation and functional estimation, and uses kernels to locally update the …
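
For orientation, the discrete Q-learning rule that the abstract builds on can be sketched as follows. This is the standard tabular update only, not the kernel-based extension proposed in the paper. The chain environment, the function name `tabular_q_learning`, the constant step size `alpha`, and all parameter values are assumptions made for illustration.

```python
import random


def tabular_q_learning(n_states=5, gamma=0.9, alpha=0.5, episodes=2000, seed=0):
    """Watkins-style tabular Q-learning on a hypothetical deterministic chain.

    States are 0..n_states-1. Action 1 moves right, action 0 moves left
    (staying put at state 0). Entering the last state pays reward 1 and
    ends the episode. The environment is an illustrative assumption.
    """
    rng = random.Random(seed)
    terminal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != terminal:
            a = rng.randrange(2)  # purely random exploration
            s_next = s + 1 if a == 1 else max(s - 1, 0)
            reward = 1.0 if s_next == terminal else 0.0
            # The terminal state carries no future value.
            future = 0.0 if s_next == terminal else gamma * max(q[s_next])
            # Q-learning update: move Q(s, a) toward the one-step target.
            q[s][a] += alpha * (reward + future - q[s][a])
            s = s_next
    return q


q = tabular_q_learning()
# Greedy action in each non-terminal state.
greedy = [max(range(2), key=lambda a: q[s][a]) for s in range(4)]
print(greedy)
```

A constant step size is used here because the toy chain is deterministic; stochastic approximation schemes generally rely on decreasing step sizes instead.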

Temporal Difference Learning with Kernels for Pricing American-Style Options

Published: 2005/05/19, Updated: 2005/06/09

Categories: Dynamic Programming, Stochastic Programming
Keywords: kernels, Robbins-Monro algorithm, TD learning

In this paper, we study the problem of estimating the cost-to-go function for an infinite-horizon discounted Markov chain with a possibly continuous state space. For implementation purposes, the state space is typically discretized. As soon as the dimension of the state space becomes large, the computation is no longer practicable, a phenomenon referred to …
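
The discretization approach mentioned in the abstract can be illustrated with a minimal TD(0) sketch on a one-dimensional chain. This is the baseline that the abstract contrasts against, not the kernel method of the paper. The chain `x' = 0.5 * x + 0.5 * U` with `U` uniform on [0, 1], the cost `c(x) = x`, the function name `td0_discretized`, and all parameter values are assumptions chosen for illustration.

```python
import random


def td0_discretized(n_bins=10, gamma=0.5, steps=100_000, seed=0):
    """TD(0) estimate of the discounted cost-to-go on a grid over [0, 1].

    Hypothetical chain: x' = 0.5 * x + 0.5 * U, with U uniform on [0, 1].
    Hypothetical cost: c(x) = x. Each bin holds one value estimate.
    """
    rng = random.Random(seed)
    values = [0.0] * n_bins
    visits = [0] * n_bins
    x = 0.5
    for _ in range(steps):
        i = min(int(x * n_bins), n_bins - 1)       # bin of the current state
        x_next = 0.5 * x + 0.5 * rng.random()
        j = min(int(x_next * n_bins), n_bins - 1)  # bin of the next state
        visits[i] += 1
        step = 1.0 / visits[i]                     # Robbins-Monro step size
        # Temporal difference update toward cost + discounted next value.
        values[i] += step * (x + gamma * values[j] - values[i])
        x = x_next
    return values


estimates = td0_discretized()
print([round(v, 2) for v in estimates])
```

For this toy chain the exact cost-to-go is J(x) = 4x/3 + 1/3 when gamma = 0.5, which gives a reference for judging the estimates. The number of bins grows exponentially with the state dimension, which is the practical limit the abstract points to.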

A Perturbed Gradient Algorithm in Hilbert Spaces

Published: 2005/03/17, Updated: 2005/05/19

Categories: Applications - OR and Management Sciences, Convex Optimization, Stochastic Programming
Keywords: infinite dimen-, perturbed gradient, stochastic quasi-gradient

We propose a perturbed gradient algorithm with stochastic noises to solve a general class of optimization problems. We provide a convergence proof for this algorithm under classical assumptions on the descent direction and new assumptions on the stochastic noises. Instead of requiring the stochastic noises to correspond to martingale increments, we only require these noises …
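
As a rough illustration of a gradient iteration perturbed by noise, the sketch below minimizes a quadratic in the plane with decreasing step sizes. It covers only the classical finite-dimensional case with independent zero-mean Gaussian noise, not the Hilbert-space setting or the weaker noise assumptions of the paper. The objective, the function name `noisy_gradient_descent`, and all parameter values are assumptions.

```python
import random


def noisy_gradient_descent(target=(1.0, -2.0), iterations=10_000,
                           noise_std=1.0, seed=0):
    """Minimize f(x) = 0.5 * ||x - target||^2 from noisy gradient samples.

    Each gradient evaluation is corrupted by independent zero-mean Gaussian
    noise (an assumption made for this sketch). Step sizes 1 / (k + 1)
    have a divergent sum and a convergent sum of squares.
    """
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for k in range(iterations):
        step = 1.0 / (k + 1)
        for i in range(2):
            gradient_i = x[i] - target[i]            # exact gradient component
            noise_i = rng.gauss(0.0, noise_std)      # stochastic perturbation
            x[i] -= step * (gradient_i + noise_i)
    return x


solution = noisy_gradient_descent()
print([round(c, 2) for c in solution])
```

With these step sizes the iterate equals the target minus the running average of the noise samples, so the error shrinks as the noise averages out.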