The Reinforce Policy Gradient Algorithm Revisited.

Shalabh Bhatnagar
Published in: CoRR (2023)
Keyphrases
  • dynamic programming
  • search space
  • NP-hard
  • policy gradient
  • learning algorithm
  • objective function
  • k-means
  • cost function
  • gradient ascent
  • neural network
  • computational complexity
  • optimal control