A sampled fictitious play based learning algorithm for infinite horizon Markov decision processes.
Esra Sisikoglu, Marina A. Epelman, Robert L. Smith
Published in: WSC (2011)
Keyphrases
- infinite horizon
- Markov decision processes
- fictitious play
- learning algorithm
- reinforcement learning algorithms
- reinforcement learning
- optimal policy
- finite horizon
- game theory
- state space
- Nash equilibria
- policy iteration
- finite state
- dynamic programming
- average cost
- partially observable
- Markov decision process
- Nash equilibrium
- average reward
- planning under uncertainty
- reward function
- long run
- decision problems
- Markov decision problems
- function approximation
- action space
- learning rate
- inventory level
- lost sales
- machine learning