Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines.

Philip S. Thomas, Emma Brunskill

Published in: CoRR (2017)

Keyphrases

function approximation
natural actor critic
reinforcement learning
policy gradient methods
function approximators
policy gradient
actor critic
state action
temporal difference
action selection
temporal difference learning
reinforcement learning algorithms
reinforcement learning problems
action space
radial basis function
model free
reinforcement learning methods
learning tasks
approximation methods
state space
learning problems
policy search
Markov decision processes
machine learning
transfer learning