Identification of optimal policies in Markov decision processes.
Karel Sladký
Published in: Kybernetika (2010)
Keyphrases
- Markov decision processes
- optimal policy
- finite state
- dynamic programming
- state space
- finite horizon
- policy iteration
- reinforcement learning
- average reward
- infinite horizon
- decision problems
- average cost
- transition matrices
- Markov decision process
- multistage
- long run
- reinforcement learning algorithms
- partially observable
- sufficient conditions
- state dependent
- semi-Markov decision processes
- control policies
- decision processes
- data mining
- action space
- partially observable Markov decision processes
- real time dynamic programming
- discount factor
- total reward
- state abstraction
- initial state
- reward function
- machine learning