On the Convergence and Optimality of Policy Gradient for Markov Coherent Risk.
Audrey Huang
Liu Leqi
Zachary C. Lipton
Kamyar Azizzadenesheli
Published in:
CoRR (2021)
Keyphrases
policy gradient
reinforcement learning
average reward
function approximation
convergence rate
model free reinforcement learning
actor critic
gradient method
natural gradient
optimal solution
markov model
optimal control
convergence speed
markov chain
variance reduction
approximation methods
single agent
long run