Sample Complexity of Policy-Based Methods under Off-Policy Sampling and Linear Function Approximation.

Zaiwei Chen, Siva Theja Maguluri

Published in: AISTATS (2022)

Keyphrases

function approximation
function approximators
sample complexity
radial basis function
temporal difference learning algorithms
reinforcement learning problems
reinforcement learning
temporal difference
model selection
cross validation
neural network
Markov decision processes
reinforcement learning algorithms
model free
learning problems
learning experience
collaborative filtering
state space
dynamic programming
feature extraction
machine learning