Robust temporal difference learning for critical domains.

Richard Klíma, Daan Bloembergen, Michael Kaisers, Karl Tuyls

Published in: CoRR (2019)

Keyphrases

temporal difference learning
function approximation
fixed point
evaluation function
approximate value iteration
temporal difference
game playing
reinforcement learning
Markov decision process
optimal policy