A Learning Scheme for Approachability in MDPs and Stackelberg Stochastic Games.
Dileep M. Kalathil, Vivek S. Borkar, Rahul Jain
Published in: CoRR (2014)
Keyphrases
- learning scheme
- stochastic games
- Markov decision processes
- Nash equilibria
- average reward
- Nash equilibrium
- learning algorithm
- multiagent reinforcement learning
- game theory
- finite state
- reinforcement learning
- policy iteration
- optimal policy
- dynamic programming
- state space
- reinforcement learning algorithms
- incomplete information
- average cost
- finite horizon
- infinite horizon
- rule learning
- partially observable
- machine learning
- long run
- game theoretic
- reward function
- neural network
- cooperative
- decision problems