When Do Off-Policy and On-Policy Policy Gradient Methods Align?

Davide Mambelli, Stephan Bongers, Onno Zoeter, Matthijs T. J. Spaan, Frans A. Oliehoek
Published in: CoRR (2024)
Keyphrases
  • policy gradient methods
  • natural actor critic
  • policy gradient
  • robot arm
  • actor critic
  • reinforcement learning
  • reinforcement learning problems
  • gradient method
  • machine learning
  • function approximation