Temporal difference learning to detect unsafe system states.

Huazhong Ning, Wei Xu, Yue Zhou, Yihong Gong, Thomas S. Huang

Published in: ICPR (2008)

Keyphrases

temporal difference learning
fixed point
function approximation
evaluation function
reinforcement learning
game playing
approximate value iteration
temporal difference
Markov decision process
Monte Carlo
active learning
reinforcement learning algorithms
policy iteration