The Reinforce Policy Gradient Algorithm Revisited.
Shalabh Bhatnagar
Published in:
CoRR (2023)
Keyphrases
dynamic programming
search space
NP-hard
policy gradient
learning algorithm
objective function
k-means
cost function
gradient ascent
neural network
computational complexity
optimal control