Policy Gradient in Robust MDPs with Global Convergence Guarantee.
Qiuhao Wang
Chin Pang Ho
Marek Petrik
Published in:
ICML (2023)
Keyphrases
global convergence
policy gradient
convergence rate
global optimum
reinforcement learning
Markov decision processes
convergence speed
optimization methods
policy search
average reward
approximation methods
multi agent systems
state space
optimization method
gradient method