Cooperative Multi-Agent Policy Gradients with Sub-optimal Demonstration.
Peixi Peng
Junliang Xing
Lu Pang
Published in:
CoRR (2018)
Keyphrases
cooperative multi agent
asymptotically optimal
dynamic programming
optimal policy
expected cost
worst case
optimal solution
information technology
object oriented
general purpose
data sets
closed form
infinite horizon
high level
Markov decision process
policy makers
state dependent
neural network