Reducing Sampling Error in Batch Temporal Difference Learning.

Brahma S. Pavse, Ishan Durugkar, Josiah Hanna, Peter Stone

Published in: CoRR (2020)

Keyphrases

temporal difference learning
fixed point
function approximation
evaluation function
monte carlo
game playing
reinforcement learning
temporal difference
approximate value iteration
markov decision process
random sampling
neural network
search space
dynamic programming
graph cuts
learning tasks
decision making
machine learning