Convergence Rates of Asynchronous Policy Iteration for Zero-Sum Markov Games under Stochastic and Optimistic Settings.
Sarnaduti Brahma, Yitao Bai, Duy Anh Do, Thinh T. Doan
Published in: CDC (2022)
Keyphrases
- markov games
- policy iteration
- convergence rate
- markov decision processes
- sample path
- markov decision process
- reinforcement learning algorithms
- markov decision problems
- reinforcement learning
- state space
- convergence speed
- step size
- finite state
- dynamic programming
- average reward
- optimal policy
- control problems
- multiagent reinforcement learning
- infinite horizon
- stochastic games
- average cost
- temporal difference
- monte carlo
- initial state
- nash equilibrium
- partially observable
- partially observable markov decision processes
- finite horizon
- decision processes