On the Convergence of Policy Gradient in Robust MDPs.
Qiuhao Wang
Chin Pang Ho
Marek Petrik
Published in:
CoRR (2022)
Keyphrases
policy gradient
reinforcement learning
policy search
Markov decision processes
computational complexity
reinforcement learning algorithms
average reward
gradient method