Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming.
Alec Koppel, Amrit Singh Bedi, Bhargav Ganguly, Vaneet Aggarwal
Published in: CDC (2022)
Keyphrases
- multi agent reinforcement learning
- average reward
- convergence rate
- stochastic games
- linear programming
- policy iteration
- primal dual
- reinforcement learning
- markov decision processes
- optimal policy
- long run
- convergence speed
- dynamic programming
- linear program
- step size
- learning rate
- markov chain
- model free
- gradient method
- optimal solution
- multi agent
- objective function
- policy gradient
- np hard
- state space
- multi agent learning
- least squares
- rl algorithms
- neural network
- reward function
- finite state
- function approximation
- decision problems