Is the Policy Gradient a Gradient?

Chris Nota, Philip S. Thomas

Published in: CoRR (2019)

Keyphrases

policy gradient
gradient method
actor critic
reinforcement learning
function approximation
parametric optimization
optimal control
model free reinforcement learning
reinforcement learning algorithms
approximation methods
variance reduction
average reward
partially observable Markov decision processes
reinforcement learning methods
dynamic programming
state action
Monte Carlo