Taylor TD-learning.
Michele Garibbo
Maxime Robeyns
Laurence Aitchison
Published in: NeurIPS (2023)
Keyphrases
td learning
temporal difference
evaluation function
function approximation
reinforcement learning
multi step
monte carlo
reinforcement learning algorithms
policy evaluation
step size
model free
neural network
policy iteration