Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation.

Jiafan He, Dongruo Zhou, Quanquan Gu

Published in: CoRR (2021)

Keyphrases

function approximation
reinforcement learning
function approximators
temporal difference learning algorithms
upper bound
temporal difference learning
temporal difference
VC dimension
learning tasks
model-free
state space
multi-agent
radial basis function
reinforcement learning algorithms
neural network
temporal difference methods
learning algorithm
machine learning
sample size
mountain car
sample complexity
reinforcement learning methods
TD learning
optimal policy
learning process