Gap-Dependent Bounds for Two-Player Markov Games.
Zehao Dou, Zhuoran Yang, Zhaoran Wang, Simon S. Du
Published in: CoRR (2021)
Keyphrases
- markov games
- reinforcement learning algorithms
- multiagent reinforcement learning
- markov decision processes
- nash equilibrium
- reinforcement learning
- stochastic games
- state space
- upper bound
- model free
- lower bound
- learning algorithm
- temporal difference learning
- worst case
- markov decision process
- function approximation
- temporal difference
- control problems
- dynamic programming
- reward function
- multi agent
- np hard
- dynamic environments
- linear programming