Sample Complexity of Policy-Based Methods under Off-Policy Sampling and Linear Function Approximation.

Zaiwei Chen, Siva Theja Maguluri

Published in: CoRR (2022)

Keyphrases

function approximation
function approximators
sample complexity
reinforcement learning
temporal difference learning algorithms
learning algorithm
objective function
lower bound
artificial neural networks
sample size