Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning.
Jakub Grudzien Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang
Published in: CoRR (2021)
Keyphrases
- multi agent reinforcement learning
- trust region
- global optimum
- optimization methods
- column generation
- multi agent learning
- multi agent
- newton method
- reinforcement learning
- log likelihood
- optimal policy
- stochastic games
- learning agents
- multi agent systems
- genetic algorithm
- action selection
- hessian matrix
- simulated annealing
- fuzzy logic
- search space
- cooperative
- branch and bound
- levenberg marquardt
- artificial neural networks
- search algorithm
- objective function
- artificial intelligence