Reducing Sampling Error in Batch Temporal Difference Learning.

Brahma S. Pavse, Ishan Durugkar, Josiah Hanna, Peter Stone

Published in: ICML (2020)

Keyphrases

temporal difference learning
function approximation
fixed point
reinforcement learning
Monte Carlo
temporal difference
evaluation function
game playing
approximate value iteration
random sampling
reinforcement learning algorithms
Markov decision process
dynamic programming