A new convergent variant of Q-learning with linear function approximation.

Diogo Carvalho, Francisco S. Melo, Pedro Santos

Published in: NeurIPS (2020)

Keyphrases

function approximation
temporal difference learning algorithms
reinforcement learning
function approximators
tile coding
state action space
temporal difference
radial basis function
temporal difference learning
learning tasks
model free
td learning
reinforcement learning algorithms
reinforcement learning problems
least squares
real valued
pattern recognition
mountain car
machine learning