Using Control Theory for Analysis of Reinforcement Learning and Optimal Policy Properties in Grid-World Problems.
S. Mostapha Kalami Heris, Mohammad-Bagher Naghibi-Sistani, Naser Pariz
Published in: ICIC (2) (2009)
Keyphrases
- optimal policy
- reinforcement learning
- control theory
- decision problems
- state space
- Markov decision processes
- Markov decision process
- dynamic programming
- multistage
- partially observable Markov decision processes
- state dependent
- finite horizon
- infinite horizon
- control policies
- machine learning
- dynamical systems
- reward function
- sufficient conditions
- NP-hard
- reinforcement learning methods
- learning algorithm