Policy Gradient Methods for Reinforcement Learning with Function Approximation.

Richard S. Sutton, David A. McAllester, Satinder P. Singh, Yishay Mansour

Published in: NIPS (1999)

Keyphrases

function approximation
policy gradient methods
natural actor-critic
reinforcement learning
policy gradient
function approximators
actor-critic
temporal difference
reinforcement learning problems
reinforcement learning algorithms
model-free
state space
reinforcement learning methods
temporal difference learning
control problems
learning tasks
learning algorithm
multi-agent
optimal policy
neuro-fuzzy
radial basis function
approximation methods
Markov decision processes
dynamic programming
transfer learning
supervised learning
semi-supervised
machine learning