Policy-Gradient Algorithms Have No Guarantees of Convergence in Continuous Action and State Multi-Agent Settings.
Eric Mazumdar
Lillian J. Ratliff
Michael I. Jordan
S. Shankar Sastry
Published in:
CoRR (2019)
Keyphrases
multi agent
policy search
learning algorithm
convergence rate
dynamic programming
partially observable markov decision processes
continuous action
computational complexity
probability distribution
optimization methods
optimal control
state dependent
policy gradient