Multi-Bellman operator for convergence of Q-learning with linear function approximation.

Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo

Published in: CoRR (2023)

Keyphrases

function approximation
temporal difference learning algorithms
temporal difference learning
function approximators
reinforcement learning
tile coding
model free
temporal difference
learning tasks
radial basis function
state action space
state action
linear program
actor critic
convergence speed
reinforcement learning algorithms
learning styles
support vector machine
pattern recognition