A Reinforcement Learning Based Algorithm for Finite Horizon Markov Decision Processes.
Shalabh Bhatnagar, Mohammed Shahid Abdulla
Published in: CDC (2006)
Keyphrases
- markov decision processes
- finite horizon
- reinforcement learning
- dynamic programming
- optimal policy
- policy iteration
- model based reinforcement learning
- average reward
- state space
- infinite horizon
- learning algorithm
- expected reward
- markov decision process
- reinforcement learning algorithms
- state abstraction
- model free
- partially observable
- decision theoretic planning
- finite state
- average cost
- control policies
- search space
- initial state
- machine learning
- state variables
- np hard
- action sets
- optimal solution
- real time dynamic programming