On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift.
Alekh Agarwal
Sham M. Kakade
Jason D. Lee
Gaurav Mahajan
Published in:
J. Mach. Learn. Res. (2021)
Keyphrases
policy gradient methods
neural network
machine learning
optimal solution
probability distribution
utility function