Temporal-difference learning for nonlinear value function approximation in the lazy training regime.
Andrea Agazzi
Jianfeng Lu
Published in:
CoRR (2019)
Keyphrases
temporal difference learning
fixed point
function approximation
temporal difference
reinforcement learning
evaluation function
game playing
approximate value iteration
Monte Carlo
reinforcement learning algorithms
Markov decision process
training set
supervised learning
Bayesian networks