A general sample complexity analysis of vanilla policy gradient.

Rui Yuan, Robert M. Gower, Alessandro Lazaric

Published in: CoRR (2021)

Keyphrases

complexity analysis
policy gradient
special case
theoretical analysis
reinforcement learning
lower bound
sample size
neural network
first order logic
function approximation