Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees.
Siliang Zeng, Tianyi Chen, Alfredo Garcia, Mingyi Hong
Published in: L4DC (2022)
Keyphrases
- actor critic
- learning algorithm
- convergence proof
- reinforcement learning
- gradient method
- optimal solution
- search space
- policy gradient
- multi agent systems
- objective function
- temporal difference
- simulated annealing
- dynamic programming
- computational complexity
- linear programming
- mathematical model
- model free
- action selection
- cost function
- temporal difference learning
- active learning
- approximate dynamic programming
- cooperative