A time aggregation approach to Markov decision processes.
Xi-Ren Cao, Zhiyuan Ren, Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus
Published in: Autom. (2002)
Keyphrases
- function approximation
- Markov decision processes
- reinforcement learning
- state space
- optimal policy
- policy iteration
- finite state
- reachability analysis
- planning under uncertainty
- dynamic programming
- reinforcement learning algorithms
- partially observable
- decision theoretic planning
- model based reinforcement learning
- state and action spaces
- risk sensitive
- finite horizon
- infinite horizon
- average reward
- action space
- transition matrices
- learning algorithm
- reward function
- total reward
- real time dynamic programming
- Markov decision process
- average cost