Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies.

Ilyas Fatkhullin, Anas Barakat, Anastasya Kireeva, Niao He

Published in: CoRR (2023)

Keyphrases

sample complexity
policy gradient methods
natural actor-critic
theoretical analysis
learning algorithm
active learning
special case
policy gradient
learning problems
robot arm
upper bound
lower bound
generalization error
training examples
sample size
supervised learning
Monte Carlo
learning tasks
optimal policy
support vector
semi-supervised
sufficient conditions
state space
dynamic programming
NP-hard
reinforcement learning