TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?
Joshua Romoff
Peter Henderson
David Kanaa
Emmanuel Bengio
Ahmed Touati
Pierre-Luc Bacon
Joelle Pineau
Published in:
CoRR (2020)
Keyphrases
temporal difference learning
fixed point
function approximation
reinforcement learning
approximate value iteration
game playing
evaluation function
temporal difference
reinforcement learning algorithms
Markov decision process
Monte Carlo
Gaussian process
policy iteration
function approximators
neural network
state space
sufficient conditions
artificial neural networks
objective function