When Do Off-Policy and On-Policy Policy Gradient Methods Align?
Davide Mambelli
Stephan Bongers
Onno Zoeter
Matthijs T. J. Spaan
Frans A. Oliehoek
Published in:
CoRR (2024)
Keyphrases
policy gradient methods
natural actor critic
policy gradient
robot arm
actor critic
reinforcement learning
reinforcement learning problems
gradient method
machine learning
function approximation