Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning.

Gregory Z. Grudic, Lyle H. Ungar

Published in: NIPS (2001)

Keyphrases

function approximation
rates of convergence
reinforcement learning
policy gradient
temporal difference
temporal difference learning
regression function
model-free
function approximators
learning tasks
state space
decision boundary
radial basis function
reinforcement learning algorithms
learning algorithm
expectation maximization
Markov decision processes
gradient method
learning experience
supervised learning
semi-supervised
data sets