Provable benefits of general coverage conditions in efficient online RL with function approximation.

Fanghui Liu, Luca Viano, Volkan Cevher

Published in: CoRR (2023)

Keyphrases

function approximation
reinforcement learning
tile coding
temporal difference learning
temporal difference
model free
learning tasks
radial basis function
temporal difference learning algorithms
reinforcement learning algorithms
exploration exploitation tradeoff
neural network
function approximators
TD learning
learning environment
state space
dynamic programming
artificial neural networks