On Markov Games with Average Reward Criterion and Weakly Continuous Transition Probabilities.
Heinz-Uwe Küenle
Published in: SIAM J. Control. Optim. (2007)
Keyphrases
- transition probabilities
- average reward
- markov chain
- reward function
- markov decision processes
- reinforcement learning algorithms
- optimal policy
- random walk
- markov models
- state space
- policy iteration
- markov decision process
- stochastic games
- monte carlo
- finite state
- reinforcement learning
- long run
- markov decision problems
- action space
- model free
- markov model
- temporal difference
- hidden variables
- data mining
- decision problems
- dynamical systems
- machine learning