Differential TD Learning for Value Function Approximation.

Adithya M. Devraj, Sean P. Meyn

Published in: CoRR (2016)

Keyphrases

temporal difference
td learning
reinforcement learning
evaluation function
function approximation
monte carlo
action selection
model free
reinforcement learning algorithms
policy evaluation
step size
function approximators
decision making
state action
neural network
knn
state space
policy iteration