Classical Policy Gradient: Preserving Bellman's Principle of Optimality.
Philip S. Thomas
Scott M. Jordan
Yash Chandak
Chris Nota
James Kostas
Published in:
CoRR (2019)
Keyphrases
policy gradient
actor critic
average reward
reinforcement learning
state action
optimal control
model free reinforcement learning
machine learning
optimal solution
search space
optimal policy
function approximation
approximation methods
gradient method