The control of a two-level Markov decision process by time aggregation.
Yat-wah Wan
Xi-Ren Cao
Published in:
Automatica (2006)
Keyphrases
Markov decision process
state space
control system
optimal policy
Markov decision processes
reinforcement learning
infinite horizon
temporal difference learning
finite horizon
machine learning
control problems
initial state
transition matrices