Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients.
Bozhidar Vasilev
Tarun Gupta
Bei Peng
Shimon Whiteson
Published in:
CoRR (2021)
Keyphrases
multi agent
cost effective
optimal policy
policy making
data sets
cooperative
decision trees
test set
learning algorithm
training phase
multi agent systems
training set
supervised learning
multiagent systems
training algorithm
infinite horizon
action selection
machine learning