Adversarial multi-armed bandit approach to two-person zero-sum Markov games.
Hyeong Soo Chang, Michael C. Fu, Steven I. Marcus
Published in: CDC (2007)
Keyphrases
- markov games
- multi armed bandit
- reinforcement learning
- markov decision processes
- reinforcement learning algorithms
- multiagent reinforcement learning
- markov decision process
- multi agent
- control problems
- multi armed bandits
- stochastic games
- multiagent systems
- state space
- function approximation
- cooperative
- model free
- learning algorithm
- nash equilibrium
- optimal policy
- dynamic programming
- temporal difference
- decision making
- action selection
- np hard
- learning process
- multi agent systems