Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning.
Jakub Grudzien Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang
Published in: ICLR (2022)
Keyphrases
- multi agent reinforcement learning
- trust region
- column generation
- global optimum
- multi agent
- optimization methods
- reinforcement learning
- optimal policy
- multi agent systems
- multi agent learning
- newton method
- log likelihood
- genetic algorithm
- hessian matrix
- learning agents
- linear programming
- stochastic games
- cooperative
- action selection
- line search
- levenberg marquardt
- support vector