The control of a two-level Markov decision process by time aggregation.

Yat-wah Wan, Xi-Ren Cao

Published in: Automatica (2006)

Keyphrases

Markov decision process
state space
control system
optimal policy
Markov decision processes
reinforcement learning
infinite horizon
temporal difference learning
finite horizon
machine learning
control problems
initial state
transition matrices