Sample Complexity of Policy-Based Methods under Off-Policy Sampling and Linear Function Approximation.
Zaiwei Chen
Siva Theja Maguluri
Published in:
CoRR (2022)
Keyphrases
function approximation
function approximators
sample complexity
reinforcement learning
temporal difference learning algorithms
learning algorithm
objective function
lower bound
artificial neural networks
sample size